I.    INTRODUCTION

A  data  type  is  a  class  of  objects  together  with  a  set  of  operations  which  may 
be  performed  on  these  objects.  An  abstract  data  type  is  a  precise  description  of  a 
class  of  objects  in  terms  of  the  semantics  of  the  operations  which  may  be 
performed  on  the  class  (Yurchak  [1984]). 

Given  an  abstract  data  type  and  two  formal  terms  defined  by  the  operations 
of  the  type,  we  consider  whether  these  two  terms  are  equivalent.  In  particular,
we  consider  the  question  of  the  decidability  of  equality  within  an  abstract  data 
type.  This  problem  in  many  cases  reduces  to  the  question  of  whether  or  not  the 
axiom  set  for  the  data  type  is  confluent.  This  thesis  is  concerned  with  this 
second  question. 

In  the  first  chapter,  we  survey  the  historical  work  that  brought  the  Church- 
Rosser  property  into  the  literature.  The  second  chapter  introduces  the  idea  of 
confluence  and  its  relation  to  the  Church-Rosser  property.  The  rest  of  this 
chapter  is  on  theorems  related  to  confluence.  In  the  third  chapter,  we  discuss  the 
algebraic  specification  of  abstract  data  types,  which  provides  the  background  to 
move  into  the  study  of  term  rewriting  systems,  which  is  the  second  part  of  the 
third  chapter.  In  the  last  chapter,  we  discuss  an  algorithm  for  showing  that  a 
given  axiom  set  (as  rewrite  rules)  is  confluent.  This  procedure  is  called  the 
Knuth-Bendix  completion  algorithm. 

II.    ORIGIN  OF  CONFLUENCE


A.    IDEAS  FROM  COMBINATORY  LOGIC 


1.     Introduction 


In  this  chapter,  we  introduce  concepts  related  to  the  Church-Rosser
theorem  as  they  are  discussed  in  Curry  and  Feys  [1958].  Combinatory  logic  is  a 
branch  of  mathematical  logic  whose  purpose  is  the  analysis  of  certain  notions  of 
such  basic  character  that  they  are  ordinarily  taken  for  granted.  These  include  the 
processes  of  substitution,  usually  indicated  by  the  use  of  variables,  and  also  the 
classification  of  the  entities  constructed  by  these  processes  into  types  or 
categories,  which  in  many  systems  has  to  be  done  intuitively  before  the  theory 
can  be  applied.  So  far  it  has  been  observed  that  these  notions,  although 
generally  presupposed,  are  not  simple;  they  constitute  a  prelogic  whose  analysis  is 
by  no  means  trivial. 

Two  questions  have  initiated  this  analysis.  The  first  of  these  is  the 
problem  of  formulating  the  foundations  of  logic  as  precisely  as  possible.  The 
second  question  is  the  explanation  of  paradoxes. 

In  order  to  get  a  better  idea  of  the  motivation  and  purpose  of 
combinatory  logic,  it  will  be  well  to  elaborate  these  points  a  little  before  we  go 
further. 

2.      Why  a  New  Functional  Notation

There is no systematic notation for functions in ordinary mathematics. The
familiar notation $f(x)$ does not distinguish between the function itself and
the value of this function for an undetermined value of the argument (in fact
the same problem occurs in Pascal when passing a function as a parameter to
another). This defect is especially striking in theories which employ
functional operations, such as functions which admit other functions as
arguments. For special operations such as differentiation and integration
there are special notations having unique meanings, but these do not
generalize. As an example, assume $P$ is an operator in a given system. If
$f(x)$ is the argument of $P$, which is expressed as $P[f(x)]$, then what is
$P[f(x+1)]$? Must $g(x) = f(x+1)$ be formulated first, and then passed to $P$
as $P[g(x)]$, or is $h(x) = P[f(x)]$ formulated first, and then $h(x+1)$? It
would appear that the results of these two different readings are the same.
For some important operators they seem the same, but are not. For example, let

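$$P[f(x)] = \begin{cases} \dfrac{f(x) - f(0)}{x} & \text{if } x \neq 0, \\ f'(0) & \text{otherwise;} \end{cases}$$

for $f(x) = x^2$,

$$P[g(x)] = P[x^2 + 2x + 1] = x + 2,$$

$$h(x) = P[f(x)] = x,$$

$$h(x+1) = x + 1 \neq P[g(x)].$$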

For the second point, let us look at the so-called Russell paradox. This
may be formulated as follows: let $F(f)$ be the property of properties $f$
defined by the equation

$F(f) = \text{not } f(f)$        (1)

where not is the symbol for negation. Then, on substituting $F$ for $f$, we have

$F(F) = \text{not } F(F)$.        (2)

If we say that $F(F)$ is a proposition, where a proposition is something which is
either true or false, then we have a contradiction. But it is an essential step in
this argument that $F(F)$ should be a proposition. This is a question of the
prelogic: in most systems it has to be decided by an extraneous argument.
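
The role played by the assumption that F(F) is a proposition can be made
concrete with a small sketch. It rests on a modelling assumption that is not in
Curry and Feys: properties are represented as Python functions returning True
or False, so "being a proposition" is read as "evaluation yields a truth
value". The names `always_true` and `try_to_evaluate` are purely illustrative.

```python
def F(f):
    """Equation (1): F(f) = not f(f)."""
    return not f(f)


def always_true(f):
    """An ordinary property of properties, used only for comparison."""
    return True


def try_to_evaluate(f):
    """Return the truth value of F(f), or None if evaluation never settles."""
    try:
        return F(f)
    except RecursionError:
        # F(F) keeps unfolding to not F(F), not not F(F), and so on.
        return None


ordinary = try_to_evaluate(always_true)   # a definite truth value
paradoxical = try_to_evaluate(F)          # no truth value is ever produced
```

Under this reading F is a perfectly good entity and F(F) a perfectly good
combination, in the spirit of requirement (a) discussed below; what fails is
only the claim that F(F) has a truth value.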

Another  well-known  paradox  is: 
"I  am  lying." 

We may explain the Russell paradox by claiming that F, or F(F), is
"meaningless". Thus, as is discussed in Curry and Feys [1958], "in
Principia Mathematica (written by Russell and Whitehead) the formation of
$f(f)$ is excluded by the theory of types (developed by Russell and Whitehead);
in some mathematicians' explanations one can not use (1) as a definition of F
because the existence of F cannot be eliminated". Certainly by way of such
restrictions we can eliminate paradoxes from a given system. But, as we will
discuss in the following paragraphs, there is something about the preceding
argument which is not explained by such exclusions.

As    stated    in    Curry    and    Feys    [1958],   the   following    requirements    are 
necessary  to  reach  the  objectives  we  have  already  discussed: 

(a)  There  will  be  no  distinction  between  different  categories  of  entities,  so  any 
construct  formed  from  the  primitive  entities  by  means  of  the  allowed 
operations  must  be  meaningful  such  that  it  is  acceptable  as  an  entity; 

(b)  There  will  be  an  operation  corresponding  to  the  application  of  a  function  to 
an  argument; 

(c)  There  will  be  an  equality  with  the  usual  properties; 

(d)  The  system  must  be  combinatorially  complete,  such  that  any  function  we  can 
define  intuitively  by  means  of  a  variable  can  be  represented  formally  as  an 
entity  of  the  system. 

By means of these four requirements, F defined by (1) is certainly significant,
and equation (2) is intuitively true. In fact this is what we have to accept,
since we cannot "explain" a paradox by getting rid of it. Instead, as Curry and
Feys put it, we must "stand and look it in the eye"; we then force the
paradoxes into the open, where we can analyse them. To me, what we should
expect from this analysis is a way to show that entities like $F(F)$ in (2) are
not in the category of propositions. This will be the main objective in the
field of combinatory logic as explained in the following paragraphs. The
analysis has two parts. As stated in Curry and Feys [1958], the first is the
analysis of the substitution processes, without considering the classification
of entities into categories. The second is the introduction of the machinery
for effecting a classification into categories.

In our analysis, a basic role is played by certain operators which
represent combinations as functions of the variables they contain. The definition
of a combinator is as follows (from Curry and Feys [1958]): "the combinations in
question are those formed from the variables alone by means of the operation
postulated in the second of the above demands. By the requirement of
combinatorial completeness, these operators are represented by certain entities of
the system. These entities, and combinations formed from them by the postulated
operation, are called combinators."

The  term  'combinatory  logic'¹  is  intended  to  describe  a  part  of
mathematical  logic  which  requires  reference  to  combinators,  including  all  that  is 
necessary  for  an  adequate  foundation  of  the  more  usual  logical  theories. 

The  combinators  themselves  may  be  defined  in  terms  of  an  operation  of 
abstraction,  or  certain  of  them  may  be  thought  of  as  primitive  ideas  and  the 
others  defined  in  terms  of  them.  If  we  consider  an  operation  of  abstraction,  this 
leads  us  to  the  calculus  of  lambda-conversion  of  A.  Church,  and  various 
modifications  of  it;  the  second  idea  leads  us  to  the  (synthetic)  theory  of 
combinators.  It  is  the  synthetic  theory  which  gives  the  ultimate  analysis  of 
substitution in terms of a system of extreme simplicity. Before introducing
Church's calculus of λ-conversion, we first discuss the notion of a formal
system.

B.    FORMAL  SYSTEMS 

1.      Axiomatic  Systems 

To  get  a  first  idea  of  a  formal  system  we  start  with  elementary  geometry 
as  taught  in  secondary  schools  (the  example  is  taken  from  Curry  and  Feys 
[1958]). 

Elementary geometry begins with certain primitive statements, called
axioms, which are accepted without proof. From these axioms all other accepted
statements are deduced according to logical rules assumed without discussion.
The theorems are the axioms and the statements deduced from them. Since the
system rests on the axioms we have chosen, it is called an axiomatic system.

¹ The choice of the term 'combinatory' instead of 'combinatorial' is in
agreement with the Oxford English Dictionary.

For a given theory, the statements deal with certain concepts, and some of
these may be left undefined, since they are assumed to be intuitively clear.
Likewise, if axioms are left undemonstrated, this is because they are assumed
to be intuitively evident (we do not have to give a long proof that
a+b+c = c+a+b by way of commutativity, since it is intuitively evident). The
theorems involving these undefined concepts and undemonstrated axioms inherit
their intuitive meaning.

As  is  well  known,  such  concrete  deductive  theories  have  been  superseded 
by  'pure'  deductive  theories.  Here  undefined  terms  are  never  tied  to  an 
interpretation.  Undemonstrated  statements  claim  no  evidence,  as  they  do  not 
even  have  presupposed  intuitive  meanings;  they  are  assumed  quite  arbitrarily, 
and  the  theorems  derived  from  them  partake  of  their  arbitrary  character.  A
theory  of  this  character  we  shall  call  an  abstract  (or  pure)  axiomatic  system. 

2.     Transition  to  Formal  System 

Even in such a pure axiomatic theory there is always a naive element,
since the theory is formulated in terms of logical concepts supposed to be
intuitively clear, and the deductions are made according to logical rules whose
validity is supposed to be intuitively evident. If we remove this last naive
element we arrive at what we call a formal system.

A formal system is essentially a set of theorems generated by precise
rules and concerning unspecified objects. The determination of the validity of a
statement in such a system does not require any experience, nor does it require
any previously known principles, not even those of logic. We should simply be
able to understand the symbols employed in a precise way, as we use them in
mathematics.

The  statements  which  the  formal  system  formulates  we  will  call  its 
elementary  statements,  those  which  it  asserts  its  elementary  theorems.  The 
elementary  statements  are  about  unspecified  objects  which  we  call  the  obs  of  the 
formal  system  (Curry  and  Feys  [1958]). 

3.     Example  of  a  Formal  System

Let us consider a very simple example of a theory, which we will call the
elementary theory of numerals.² The obs of this elementary theory will be
0, 0', 0'', ... etc. Elementary statements will be equations between the obs,
e.g. 0 = 0, 0' = 0''. We take as axiom 0 = 0, and as a rule of derivation "If
two obs are equal, their successors are equal". We can then derive elementary
theorems such as 0' = 0', 0'' = 0''.

Let  us  now  state  this  theory  more  formally.  We  have  to  consider: 

a.  Obs (objects).

(1)  One primitive ob: 0.

(2)  One unary operation, indicated by priming.

(3)  One formation rule of obs: If x is an ob, then x' is an ob.

b.  Elementary statements.

(1)  One binary predicate: =.

(2)  One formation rule of elementary statements:

If x and y are obs, then x = y is an elementary statement.

c.  Elementary theorems.

(1)  One axiom: 0 = 0.

(2)  One rule of deduction: If x = y then x' = y'.

These  conventions  constitute  the  definition  of  the  theory  as  a  formal  system  in 
the  above  sense. 

The elementary theorems of this system are precisely those in the list:

0 = 0,

0' = 0',

0'' = 0'',

...

² Curry and Feys [1958], pp. 13-14.

These are true statements about the system. But once the system has been
defined, we can make other statements about it, e.g. the statement

If y is an ob, then y = y

is a true statement about the system, although not an elementary theorem. That
is an example of what we will call an epitheorem.
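
The elementary theory of numerals is small enough to be written out as a
program. The following is a minimal sketch, not part of Curry and Feys'
presentation: obs are represented as strings, an elementary statement x = y as
a pair (x, y), and all function names are illustrative assumptions.

```python
from itertools import islice

ZERO = "0"  # the single primitive ob


def successor(ob):
    """The unary operation of the system: priming an ob."""
    return ob + "'"


def elementary_theorems():
    """Enumerate the elementary theorems as pairs (x, y) standing for x = y."""
    theorem = (ZERO, ZERO)  # the axiom 0 = 0
    while True:
        yield theorem
        x, y = theorem
        theorem = (successor(x), successor(y))  # rule: x = y gives x' = y'


def is_elementary_theorem(x, y):
    """Decide derivability of x = y by undoing the rule of deduction."""
    while x.endswith("'") and y.endswith("'"):
        x, y = x[:-1], y[:-1]
    return (x, y) == (ZERO, ZERO)


first_three = list(islice(elementary_theorems(), 3))
```

Here `first_three` holds the pairs for 0 = 0, 0' = 0' and 0'' = 0''. The
statement 0' = 0'' is a well-formed elementary statement but
`is_elementary_theorem("0'", "0''")` rejects it. The epitheorem "if y is an ob,
then y = y" is a statement about the generator as a whole, not one of the
pairs it yields.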
4.     Definition  of  a  Formal  System 

We  define  a  formal  system  by  a  set  of  conventions  which  we  call  its 
primitive  frame.    This  frame  has  three  parts: 

(a)  a  set  of  objects  which  we  call  obs,

(b)  a  set  of  statements,  which  are  called  elementary  statements  concerning  these 
obs, 

(c)  the   set   of  those   elementary   statements   which    are   true,   constituting   the 
elementary  theorems. 

In the first part, the primitive frame enumerates certain primitive obs or
atoms, and certain primitive operations, each of which is a mode of combining a
finite sequence of obs to form a new ob. It also states the rules according to
which further obs are to be constructed from the atoms by the operations. The
obs of the system are then precisely those formed from the atoms by the
operations according to the rules; furthermore, obs constructed by different
processes are distinct as obs.

In  the  second  part,  the  primitive  frame  enumerates  certain  (primitive)
predicates  each  of  which  is  a  way  of  forming  a  statement  from  a  finite  sequence  of 
obs.  It  also  defines  the  rules  according  to  which  elementary  statements  are 
formed  from  the  obs  by  these  predicates.  Then  we  will  consider  that  the 
elementary  statements  are  precisely  those  so  formed. 

Since the first two parts of the primitive frame have features in common,
it is natural to consider them together, and to introduce terminology which
can be applied to either. Thus the considerations based on the two parts together
constitute the morphology of the system; the rules of the morphology constitute
the formation rules; and the atoms, operations, and predicates, taken
collectively, constitute the primitive ideas. The morphological part of the
primitive frame then enumerates the primitive ideas and enunciates the formation
rules. To consider simultaneously the properties of the operations and
predicates we group them together as functives. Each functive has a certain
finite number of arguments; this number will be called its degree. As usual,
functives of degree one will be called unary, those of degree two binary, and so
on. Given an n-ary functive, the ob or statement formed from n obs by that
functive will be called a closure. Occasionally it is acceptable to admit, as
predicates of degree 0, certain unanalyzed primitive statements. (The
terminology is from Curry and Feys [1958].)

The  third  part  of  the  primitive  frame  states  the  axioms  and  deductive 
rules  of  the  system.  Axioms  are  elementary  statements  stated  to  be  true 
unconditionally.  There  may  be  a  finite  list  of  these  or  they  may  be  given  by  rules 
determining  an  infinite  number  in  an  effective  manner  (e.g.  by  axiom  schemes). 
The  deductive  rules  specify  how  theorems  may  be  derived  from  the  axioms.  The 
elementary  theorems  are  the  axioms  together  with  the  elementary  statements 
derived  from  them  according  to  the  deductive  rules.  In  contradistinction  to  the 
morphology,  considerations  depending  essentially  on  the  third  part  of  the
primitive  frame  will  be  called  theoretical;  taken  collectively,  they  constitute  the 
theory  proper. 

There is considerable overlap between the notion of a formal system and
that of an abstract algebra in ordinary mathematics. Therefore we had better
emphasize certain differences. In an algebra, we start with a set of elements
and a set of operations. The elements, and the operations that establish
correspondences among them, are regarded as existing in advance. The sequences
formed from the elements are called terms. Given a term of n elements, an
operation of degree n "assigns" to this term one of the elements as a "value".
The case n = 0 is accepted as a "fixed element" or "constant" (for example 0 in
the naturals). These fixed elements are not analogous to the atoms, because it
is not the rule, but the exception, that all the elements are obtained from the
fixed elements by the operations. Moreover, equality is taken for granted and it
often happens that the same element may be obtained by the operations in many
different ways. In this respect the notion of a formal system is quite
different. What is given is not a set of elements but the atoms and the
operations, and the obs are generated from them. As we stated earlier, the obs
are obtained from the atoms by the operations, and different processes of
construction result in different obs. So an ob can be considered as a process
of generation.
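
The contrast between elements of an algebra and obs can be sketched in code.
This is an illustration under stated assumptions rather than a construction
from the source: the classes `Atom` and `Apply`, the atoms "1" and "2", the
operation "+", and the function `value` are all hypothetical names chosen for
the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Apply:
    op: str        # name of a primitive operation
    args: tuple    # the obs combined by the operation


def value(ob):
    """Interpret an ob as an element of the algebra of natural numbers."""
    if isinstance(ob, Atom):
        return int(ob.name)
    if ob.op == "+":
        left, right = ob.args
        return value(left) + value(right)
    raise ValueError(f"unknown operation {ob.op!r}")


one, two = Atom("1"), Atom("2")
a = Apply("+", (one, two))   # the ob built as 1 + 2
b = Apply("+", (two, one))   # the ob built as 2 + 1
```

As obs, `a` and `b` are distinct, because they record different processes of
construction. As elements of the algebra they coincide: `value(a)` and
`value(b)` are the same number. Whether two distinct obs should be regarded as
equal is exactly what the axioms of a formal system have to settle.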

5.     Variables 

The construction of a formal language has to be explained in a
communicative language understood by both the speaker and the listener. Let us
call this language the U-language. In earlier sections, words such as
'statement', 'ob', 'operation', 'theorem', which are used in the presentation of
the elementary system of numerals, are words which are supposed to have meaning
in the U-language before the formal system is introduced. But symbols such as
'0', "'" and '=' are new and they are not in the U-language. Let us call the
language in which these symbols are the elementary symbols the A-language (see
Curry and Feys [1958] for details).

The  word  'variable'  has  two  different  meanings.  First,  a  variable  is  a 
symbol  or  expression  of  the  U-language  called  an  intuitive  or  U-variable.  For 
example,  'x',  'y'  used  in  the  example  of  a  formal  system  are  U-variables.  These
are  certainly  symbols,  not  obs,  and  a  formal  system  is  not  about  them.  Secondly, 
formal  systems  can  have  the  category  of  atoms  called  'variables'  in  the  primitive 
frame.  These  are  called  formal  variables.  So  a  formal  variable  is  not  a  symbol, 
but  an  ob. 

Three kinds of formal variables are (a) indeterminates, (b) substitutive
variables, and (c) bound variables.

(a)  An indeterminate is an atom concerning which the primitive frame specifies
nothing except that it is an ob.

(b)  Substitutive variables are those with respect to which there is a rule of
substitution. Such a rule requires that a class of obs be specified for which
arbitrary obs, or obs of a certain kind, may be substituted under certain
circumstances. Substitutive variables are not indeterminates since they play
a role with respect to the substitution rule.

In a syntactical system one explains substitution in terms of actual
replacement of a symbol by an expression. In a formal system substitution is
an operation on obs which has to be defined abstractly. We are not going to
explain it in more detail here.

(c)  A system contains bound variables just when there is formulated a set of
substitutive variables and at least one proper operation in which these
variables play a special role. So in a formalization of integral calculus
the variable of integration is bound. As we see, bound variables are used
when we have arguments which are to be interpreted as functions. Bound
variables have all the complexities of substitutive variables and some
additional ones (for details see Curry and Feys [1958]).

Indeterminates and substitutive variables together are called free
variables. In other words, every variable which is not bound is a free variable.
Substitutive variables and indeterminates have much in common. In fact,
substitutions of arbitrary obs for the free variables are possible in either case.

6.     Monotone  Relations 

A monotone relation is a relation R such that

$X \mathrel{R} Y \implies A \mathrel{R} B$

whenever B is the result of replacing an occurrence of a component X of A by Y.
A monotone relation which is reflexive and transitive will be called a monotone
quasi-ordering; if, in addition, it is symmetric it will be called an
equivalence. Let $R_0$ be a given relation; then the monotone quasi-ordering
generated by $R_0$ is the relation R defined by the properties (ρ), (τ),
together with (ε) $X \mathrel{R_0} Y \implies X \mathrel{R} Y$. The monotone
equivalence generated by $R_0$ is that defined by these postulates together
with (σ) (in the next section we define these properties).

C.    CALCULUS OF λ-CONVERSION


1.     λ-notation


Here, we describe the λ-notation, which originated in the calculus of
λ-conversion of A. Church and J.B. Rosser (see Church [1941]). To do that we
first must remember that a function is a law of correspondence, i.e., a class of
ordered couples, and that to indicate the function we must indicate both
elements of each couple. If M abbreviates an expression containing x which
indicates the value of a function when the argument has the value x, we write
$\lambda x(M)$ or $\lambda x.M$ to designate the function itself. Thus
$\lambda x(x^2)$ means the function having $x^2$ for value if x is the value of
the argument. Suppose we use D for differentiation and J for integration; then
the statements

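a)  $(x+1)^2 = x^2 + 2x + 1$,

b)  $x^2$ is a function of $x$,

c)  $\dfrac{d}{dx}\,x^2 = 2x$,

d)  $\int_0^3 x^2\,dx = 9$

will become

a)  $\lambda x.(x+1)^2 = \lambda x.(x^2 + 2x + 1)$

b)  $\lambda x.x^2$ is a function

c)  $D(\lambda x.x^2) = \lambda x.2x$

d)  $J(0, 3, \lambda x.x^2) = 9$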

As for the example in connection with the first point, if we let $E$ be an
operator such that

$E(\lambda x.f(x)) = \lambda x.f(x+1)$

then the first of the two evaluations of $P[f(x+1)]$ is
$P[E(\lambda x.f(x))]$, the second is $E[P(\lambda x.f(x))]$. If we use $f$ for
$\lambda x.f(x)$ these are $P[Ef]$ and $E[Pf]$, respectively.
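
Treating functions as values makes the difference between $P[Ef]$ and $E[Pf]$
easy to check. The sketch below assumes that $P$ is the difference-quotient
operator, $P[f](x) = (f(x) - f(0))/x$ for $x \neq 0$, which is consistent with
the values $x+2$ and $x+1$ quoted earlier for $f(x) = x^2$. The parameter
`derivative_at_zero` is an illustrative device for the case $x = 0$ and does
not come from the source.

```python
from fractions import Fraction


def P(f, derivative_at_zero=None):
    """Map f to the function x -> (f(x) - f(0)) / x (assumed form of P)."""
    def Pf(x):
        if x == 0:
            if derivative_at_zero is None:
                raise ValueError("the value at 0 requires f'(0)")
            return derivative_at_zero
        return Fraction(f(x) - f(0)) / x
    return Pf


def E(f):
    """Shift operator: E(f) is the function x -> f(x + 1)."""
    return lambda x: f(x + 1)


def square(x):
    return x * x


P_of_Ef = P(E(square))   # first reading:  P applied to g(x) = f(x + 1)
E_of_Pf = E(P(square))   # second reading: h(x + 1) where h = P f

samples = [(x, P_of_Ef(x), E_of_Pf(x)) for x in range(1, 6)]
```

For each sampled x the first reading gives x + 2 and the second gives x + 1,
so the two operators do not commute. The λ-notation removes the ambiguity
because `P(E(f))` and `E(P(f))` are visibly different expressions.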

2.     Functional  Abstraction

Idea  of  functional  abstraction 

The examples of the last section can be generalized. A generalization of
this kind is implied in regarding D, J, P and E as functions; they are
functions whose arguments are other functions; except for J, their values are
also functions.

As stated earlier we use $\lambda x.M$ to denote a function itself. The
formation of $\lambda x.M$ from $x$ and $M$ is called functional abstraction.
For functions of several arguments we might similarly define n-ary functional
abstraction as

(1)  $\lambda^n x_1, \ldots, x_n.M$

which means the function whose value is M when the arguments are
$x_1, \ldots, x_n$.

Certain assumptions are very important. For example, let us take
addition, whose value is $x + y$ for the arguments $x$ and $y$. If we regard
$x$ as a fixed value, the function $\lambda y(x+y)$ (or $\lambda y.x+y$) will
stand for the operation of adding the argument to $x$. If we use the
generalized concept of a function, this can itself be regarded as the value of
a function of $x$. In our notation this function is
$\lambda x(\lambda y(x+y))$ or $\lambda x.\lambda y.x+y$. We can adopt the
definition:

$\lambda^2 xy.x+y = \lambda x.\lambda y.x+y$

If we assume $M = \lambda y.x+y$ then the above equation will become

$\lambda^1 x.M = \lambda x.M$

In general

$\lambda^{n+1} x_1, \ldots, x_n, y.M = \lambda^n x_1, \ldots, x_n(\lambda y.M)$

Thus we can express functions of any number of arguments by means of simple
functional abstraction. From here on, the exponent of λ will be omitted for the
sake of simplicity.
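
The reduction of n-ary abstraction to repeated simple abstraction can be
sketched directly with Python functions. The helper `curry` and the names
`add2` and `add_curried` are illustrative assumptions, not notation from the
source.

```python
def add2(x, y):
    """Plays the role of the binary abstraction of x + y over x and y."""
    return x + y


# Simple abstraction applied twice: a function of x whose value is a function of y.
add_curried = lambda x: (lambda y: x + y)


def curry(f, n):
    """Turn an n-ary function (n >= 1) into n nested unary functions."""
    def collect(args):
        if len(args) == n:
            return f(*args)
        return lambda y: collect(args + (y,))
    return collect(())


add_three_to = add_curried(3)            # x regarded as fixed at 3
curried_again = curry(add2, 2)           # built mechanically from add2
ternary = curry(lambda x, y, z: x * y + z, 3)
```

Applying `add_three_to` to 4, `curried_again` to 3 and then 4, and `add2` to
the pair (3, 4) all give the same value, which is the content of the
definition above: each additional argument is handled by one more simple
abstraction.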

Bound variables and functional abstraction

A system contains bound variables when there is a formulated set of
substitutive variables (e.g., $x$ in $f(x) = x^2 + 2x$) and at least one proper
operation in which these variables play a special role.

Let  us  call  the  proper  operation  mentioned  above  a  binding  operation. 
Other  operations  will  be  considered  as  ordinary  operations. 

Any binding operation can be defined in terms of a functional operation
and an ordinary operation. For example, let $f$ be a primitive binding
operation with m binding arguments and n ordinary arguments, shown as

$f(x_1, \ldots, x_m, M_1, \ldots, M_n)$

where $x_i$ is a binding argument and $M_j$ is an ordinary argument. Let
$M_j' = \lambda x_1, \ldots, x_m.M_j$ and let $F$ be a new ordinary operation
of n arguments. Then the above primitive binding operation $f$ will become the
ordinary function $F(M_1', \ldots, M_n')$. So by way of bound variables and
functional abstraction we are able to use functions as arguments.
3.      Conversion  Rules 

In  this  section,  we  will  consider  how  to  formulate  an  equality  relation  in 
the  system. 

As a relation, equality is supposed to satisfy the following properties:

(ρ)  $X \mathrel{R} X$        (Reflexiveness)

(σ)  $X \mathrel{R} Y \implies Y \mathrel{R} X$        (Symmetry)

(τ)  $X \mathrel{R} Y \;\&\; Y \mathrel{R} Z \implies X \mathrel{R} Z$        (Transitivity)

(ν)  $X \mathrel{R} Y \implies XZ \mathrel{R} YZ$        (Right monotony)

(μ)  $X \mathrel{R} Y \implies ZX \mathrel{R} ZY$        (Left monotony)

A relation that is left and right monotone is called a monotone relation. In
order to have the replacement theorem we must also have the rule

(ξ)  $X \mathrel{R} Y \implies \lambda x.X \mathrel{R} \lambda x.Y$.

Since these properties do not exhaust the properties of equality, we need
certain other principles. The following section describes those which are
defined in Church [1941] and discussed in Curry and Feys [1958].

α- and β-conversion rules

If we consider the meaning of bound variables, it is clear that the choice
of bound variable is irrelevant; the correspondence is the same no matter what
variable is used to indicate it. Thus we should like to have the axiom scheme

$\lambda x.X = \lambda y.[y/x]X$

where $[y/x]$ means substitution of $y$ for $x$. As stated, however, this
scheme leads to confusion. Let us look at the following example.

If $X$ were $xy$, the above equation would be

$\lambda x.xy = \lambda y.yy$

where the two sides obviously do not have the same meaning. This situation is
called confusion of bound variables. In another example,

$\int_0^3 6xy\,dx = 27y$

if we change the variable $x$ to $y$ then the equation becomes

$\int_0^3 6y^2\,dy = 27y$

which is false.

To get rid of this confusion of variables, we add a restriction to the
scheme:

(α)  If $y$ is not free in $X$ then

$\lambda x.X = \lambda y.[y/x]X$.

In the next step, if $\lambda x.M$ is the function whose unspecified value is
$M$, then its application to any $N$ must be the same as the result of
substituting $N$ for $x$ in $M$, as shown in the formulation

(β)  $(\lambda x.M)N = [N/x]M$.

Here too there is a possibility of confusion of variables which must be
excluded. Assume $M = \lambda y.xy$ and $N = y$. Then substitution of $N$ for
$x$ in $M$ without considering the bound variables would lead to
$\lambda y.yy$. But if we first transform $M$ to $\lambda z.xz$ by (α) and then
substitute, the result will be $\lambda z.yz$. This kind of confusion may occur
if there is a free variable in $N$ which is bound in $M$. This possibility can
be excluded by adding a restriction to (β). But if we change the definition of
substitution in such a way that bound variables are renamed automatically so as
to avoid confusion, then (β) may be accepted without restriction.
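
A definition of substitution that renames bound variables automatically can be
sketched as follows. This is one common way of doing it, offered as an
illustration and not as the definition used by Church or by Curry and Feys.
Terms are encoded as tuples, and the constructors `var`, `app`, `lam` and the
fresh names `z0`, `z1`, ... are assumptions of the sketch.

```python
from itertools import count


def var(name): return ("var", name)
def app(f, a): return ("app", f, a)
def lam(x, body): return ("lam", x, body)


def free_vars(t):
    """Set of variables occurring free in the term t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}


def fresh(avoid):
    """Pick the first of z0, z1, ... that does not occur in `avoid`."""
    return next(f"z{i}" for i in count() if f"z{i}" not in avoid)


def subst(n, x, m):
    """[n/x]m: replace free occurrences of x in m by n, avoiding capture."""
    if m[0] == "var":
        return n if m[1] == x else m
    if m[0] == "app":
        return app(subst(n, x, m[1]), subst(n, x, m[2]))
    y, body = m[1], m[2]
    if y == x:
        return m                          # x is bound here; nothing to replace
    if y in free_vars(n) and x in free_vars(body):
        z = fresh(free_vars(n) | free_vars(body) | {x})
        body = subst(var(z), y, body)     # rename the bound variable, as in (alpha)
        y = z
    return lam(y, subst(n, x, body))


def beta(t):
    """Contract a term of the form (lambda x. M) N to [N/x]M, as in (beta)."""
    if t[0] != "app" or t[1][0] != "lam":
        raise ValueError("not of the form (lambda x. M) N")
    _, (_, x, m), n = t
    return subst(n, x, m)


M = lam("y", app(var("x"), var("y")))     # lambda y. x y
N = var("y")
result = subst(N, "x", M)                 # lambda z0. y z0
```

For the example in the text, `result` is the encoding of $\lambda z.yz$ (with
`z0` playing the role of $z$), not the confused $\lambda y.yy$. With this
substitution, `beta` needs no side condition.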

General  Concept 

The monotone equivalence generated by (α) and (β) is called
β-convertibility; that generated by (α) alone is called α-convertibility.
Besides equivalence, the monotone quasi-ordering which is called reducibility is also used
in  Church's  theorems.  We  will  symbolize  it  as  ^.  The  monotone  quasi-ordering 
which we will use in the Church-Rosser theorem is called reducibility. Conversion
is a transformation of an ob into one with which it is convertible. An ob which
cannot be reduced in any way is in normal form. There are different kinds of
reduction, as in the case of conversion. Since (α) is symmetric, α-reduction
and α-conversion are the same. A reduction is a transformation of an ob into
one to which it is reducible. The converse transformation is called an expansion.

$\eta$-conversion Rules

This conversion rule says:

($\eta$) If $x$ is not free in $M$, then $\lambda x.(Mx) \geq M$.

This rule is intuitively acceptable for convertibility, because both sides of the
relation represent the function whose value for the argument $X$ is $MX$. On the
other hand, there are purposes for which the rule is not acceptable, because the
left side is a function, while the right side may not be. But this is a matter of
interpretation, because in general every object is a function too.

The rules ($\xi$) and ($\eta$) together are equivalent to the following rule, which
is a form of the principle of extensionality:

($\zeta$) If $x$ is not free in either $M$ or $N$, then

$Mx = Nx \implies M = N$.

The rule ($\zeta$) follows from ($\xi$) and ($\eta$):

$Mx = Nx \implies \lambda x.(Mx) = \lambda x.(Nx)$   by ($\xi$)

$\implies M = N$   by ($\eta$).

Conversely, ($\eta$) and ($\xi$) follow from ($\zeta$) together with ($\beta$), thus:

$(\lambda x.Mx)x = Mx$   by ($\beta$)

$\lambda x.Mx = M$   by ($\zeta$).

This proves ($\eta$). To prove ($\xi$) we have

$M = N \implies (\lambda x.M)x = (\lambda x.N)x$   by ($\beta$)

$\implies \lambda x.M = \lambda x.N$   by ($\zeta$).

We call the lambda-conversion calculus with the rule ($\eta$) the $\beta\eta$-calculus.

Redexes 

The  terminology  introduced  here  simplifies  many  of  the  succeeding 
formulations. 

We call an object which can form the left side of an instance of one of
the rules ($\beta$), ($\eta$), or ($\delta$) (introduced later) a redex of the corresponding type; the
right side of the same instance will be called the contractum of the redex. A
replacement of a redex by its contractum will be called a contraction of the type of
the rule. Thus a redex of type ($\beta$), or simply a $\beta$-redex, is an object of the form
$(\lambda x.M)N$; its contractum is $[N/x]M$; and a replacement of an instance of
$(\lambda x.M)N$ by $[N/x]M$ is a $\beta$-contraction.

$\delta$-conversion rules

This is a third kind of reduction and a third type of $\lambda$-calculus, giving a
rule of conversion of the following kind:

($\delta$) Let $M$ be an ob which is not a $\beta$-redex and not of the form $\lambda x.N$, and let $M$
contain no free variables nor any proper components which are redexes of any
kind. Let $M'$ be an object such that no constituent of $M'$ is a free variable and $M'$
is not a redex of the same kind as $M$. Then $M$ is convertible into $M'$.

An object $M$ to which such a rule may be applied is called a $\delta$-redex. It is
clear that a $\delta$-redex is of the form

$a M_1 M_2 \ldots M_n$,

where $a$ is a primitive constant and $M_1, \ldots, M_n$ are in normal form and contain
no free variables.

A $\lambda$-calculus which admits a form of the rule ($\delta$) along with ($\alpha$) and ($\beta$)
will be called a $\beta\delta$-calculus; if it also admits the rule ($\eta$), it will be called a
$\beta\eta\delta$-calculus, or simply a full $\lambda$-calculus. The Church-Rosser theorem, which is
stated in the next section, is based on an arbitrary full $\lambda$-calculus.

D.    CHURCH  ROSSER  THEOREM 

One of the main results of the calculus of $\lambda$-conversion is the so-called Church-Rosser
theorem. This theorem may be stated briefly as follows:

($\chi$) If $X = Y$, then there is an ob $Z$ such that $X \geq Z$ & $Y \geq Z$.

The property ($\chi$) is known as the Church-Rosser property. More generally, let $=$ be the
equivalence relation generated by a relation $\geq$. Then the property ($\chi$) reads:

If $X = Y$, then there is a $Z$ such that $X \geq Z$ & $Y \geq Z$.

The classical Church-Rosser theorem is, then, the following:

If $\geq$ is the reducibility relation defined earlier for any of the forms of the
$\lambda$-calculus, then the property ($\chi$) holds.

The proof of this theorem has been studied (besides Church and Rosser) by many
other mathematicians. For references see Curry and Feys [1958].

E.    CHURCH-ROSSER  AND  CONFLUENCE  PROPERTIES 
The following property, which we call (O), is implied by ($\chi$):

If for some $U$

$U \geq X$ & $U \geq Y$,

then there is a $Z$ such that

$X \geq Z$ & $Y \geq Z$.

Both properties are shown in Figure 2.1.

1,     Implications 

The following theorem shows that (O) is equivalent to ($\chi$) provided that
$\geq$ is a quasi-ordering.

Theorem: If the relation $\geq$ has the properties ($\rho$), ($\tau$), and (O),
then it has the property ($\chi$).

For a proof, see Curry and Feys [1958], p. 112. Property (O) is, in essence, the
property that we will call confluence.

Figure 2.1. (a) Church-Rosser Property, (b) Property (O)

III.    PROPERTIES  RELATED  TO  THE  CONFLUENCE

In  the  second  chapter  we  mentioned  the  relation  between  the  Church-Rosser 
property  and  the  confluence  property.  Here  we  will  look  at  the  confluence 
property  in  terms  of  term  rewriting  systems. 

A.    GENERAL  CONFLUENCE 

Let a class of objects be given, together with a set $P$ of pairs of objects such that
one member of each pair is obtained from the other by a move; two objects are regarded as
equivalent if and only if one is obtainable from the other by a sequence of moves.
For example, in group theory the objects are words made from an alphabet
$a, b, \ldots, a^{-1}, b^{-1}, \ldots$ (where $a^{-1}$ is the inverse of $a$) and a move is the insertion or
removal of a consecutive pair of letters $xx^{-1}$ or $x^{-1}x$.

In Church's $\lambda$-calculus we define $\lambda$-conversion as the reflexive and transitive
closure of the $\alpha$- and $\beta$-conversion rules. $\lambda$-conversions are moves of this kind.

As we defined earlier, the moves of $\lambda$-conversion naturally fall into two
categories, reductions and expansions. Likewise, in the group theory example,
cancelling a pair of letters can be called a reduction, and insertion is then an
expansion. This dichotomy between reduction and expansion plays an important
role in confluence relations.

If a relation is transitive, confluence is equivalent to the Church-Rosser
property, which expresses the fact that equivalence (or interconvertibility) of two
terms can be checked by reducing them to a common form:

if $A$ and $B$ are "equivalent", then there exists a third object $C$
obtainable from both $A$ and $B$ by reduction sequences.
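For the group theory example, checking equivalence by reduction to a common form can be sketched as follows. The encoding is an assumption made for illustration: a lower-case letter stands for a generator and the matching upper-case letter for its inverse, and the names `inverse`, `reduce_word`, and `equivalent` are hypothetical.

```python
def inverse(letter):
    """Inverse of a letter: 'a' <-> 'A' (upper case stands for a^-1)."""
    return letter.swapcase()

def reduce_word(word):
    """Cancel adjacent pairs x x^-1 or x^-1 x until none remain."""
    stack = []
    for letter in word:
        if stack and stack[-1] == inverse(letter):
            stack.pop()            # a reduction move: remove the pair
        else:
            stack.append(letter)
    return "".join(stack)

def equivalent(u, v):
    """Two words are equivalent iff they reduce to the same word."""
    return reduce_word(u) == reduce_word(v)
```

Here no expansion moves are ever needed: two words such as `"abBc"` and `"ac"` are compared purely through their reduced forms.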

Another problem in confluence theorems is the search for "end forms" or
"normal forms", i.e., objects which admit no reduction. In any theory in which
the confluence property holds, no equivalence class can contain more than one
normal form (see Lemma 3.1). However, if there exist infinite sequences of
reductions which do not terminate, then there is a question of whether or not
normal forms exist. In the following, we will follow the terminology and notation
found in Huet [1980]. We will use arrows as relations, since we are going to deal
with rewrite rules, which will be explained in the fourth chapter.

1.     Notation 

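Let $E$ be an arbitrary set. Let $\to$, $\to_a$, $\to_b$ be symbols for reduction relations on $E$.
$\iota$ is the identity relation on $E$, that is, $\iota = \{\langle x, x\rangle \mid x \in E\}$.
$\cdot$ is the operator for composition of relations, so

$\to_a \cdot \to_b = \{\langle x, y\rangle \mid \text{there is a } z \text{ such that } x \to_a z \;\&\; z \to_b y\}$.

$\to^{-1}$, also written $\leftarrow$, is the inverse relation of $\to$, that is,
$\to^{-1} = \{\langle x, y\rangle \mid y \to x\}$.

With these definitions:

$\to^0 = \iota$

$\to^{=} = \to \cup \iota$ (Reflexive closure)

$\to^i = \to \cdot \to^{i-1}$, $i > 0$

$\to^+ = \bigcup_{i>0} \to^i$ (Transitive closure)

$\to^* = \to^+ \cup \iota$ (Reflexive, transitive closure)

$\leftrightarrow = \leftarrow \cup \to$ (Symmetric closure)

If $x$ is an element of $E$ and there is no $y$ such that $x \to y$, then $x$ is a $\to$-normal
form. Let $N$ be the set of all such elements. For $y$ an element of $E$, if
there exists an $x$ in $N$ such that $y \to^* x$, then $x$ is a $\to$-normal form of $y$.
For a relation $\to$, we let

$x \to^* \cdot \leftarrow^* y$ if and only if there exists a $z$ such that $x \to^* z$ and $y \to^* z$,

$x \leftarrow^* \cdot \to^* y$ if and only if there exists a $z$ such that $z \to^* x$ and $z \to^* y$,

$\Lambda(x) = \sup\{i \mid \text{there is a } y \text{ such that } x \to^i y\}$, an element of $\mathbb{N} \cup \{\infty\}$,

$\Delta(x) = \{y \mid x \to y\}$,

$\Delta^+(x) = \{y \mid x \to^+ y\}$,

$\Delta^*(x) = \Delta^+(x) \cup \{x\}$.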

A relation $\to$ is

(i) inductive iff for every sequence $x_1 \to x_2 \to \cdots \to x_n \to \cdots$ there is a $y$ such
that $x_i \to^* y$ for all $i \geq 1$;

(ii) acyclic iff $\to^+$ is irreflexive (then $\to^*$ is a partial ordering);

(iii) terminating iff there is no infinite sequence $x_1 \to x_2 \to \cdots \to x_n \to \cdots$
(then $\to^*$ is well founded, which makes sense in some mathematical discourse.
When a careful definition is required, inductive definitions are used to
characterize the set of well-founded formulas);

(iv) bounded iff $\Lambda(x) < \infty$ for all $x$ (then $\to^*$ has the finiteness property).

Every bounded relation is terminating, and every terminating relation is inductive
and acyclic. Let $P$ be any predicate on $E$. We say that $P$ is $\to$-complete iff

for all $x$ in $E$: [for all $y$ in $\Delta^+(x)$, $P(y)$] $\implies P(x)$.

We say that $\to$ is locally finite iff $\Delta(x)$ is finite for all $x$ in $E$.

Let $\to$ be a locally finite relation. For every $x$ in $E$, if $\Lambda(x) = \infty$, then there
exists an infinite sequence $x = x_1 \to x_2 \to \cdots \to x_n \to \cdots$. Therefore a locally finite
relation is bounded iff it is terminating.

We say that $\to$ is globally finite iff $\Delta^*(x)$ is finite for all $x$ in $E$. A
terminating locally finite relation is globally finite (see Huet and Oppen
[1980]).
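On a finite set these notions can be computed directly. The following sketch assumes that a relation is given as a Python set of pairs; the function names are hypothetical and simply mirror $\Delta$, $\Delta^+$, $\Delta^*$, and the set of normal forms of an element.

```python
def successors(rel, x):
    """Delta(x): the one-step successors of x."""
    return {b for (a, b) in rel if a == x}

def reachable_plus(rel, x):
    """Delta^+(x): everything reachable from x in one or more steps."""
    seen, frontier = set(), successors(rel, x)
    while frontier:
        y = frontier.pop()
        if y not in seen:
            seen.add(y)
            frontier |= successors(rel, y)
    return seen

def reachable_star(rel, x):
    """Delta^*(x) = Delta^+(x) together with x itself."""
    return reachable_plus(rel, x) | {x}

def normal_forms_of(rel, x):
    """The normal forms of x: reachable elements with no successor."""
    return {y for y in reachable_star(rel, x) if not successors(rel, y)}

def is_acyclic(rel, elements):
    """Acyclic iff no element is reachable from itself in one or more steps."""
    return all(x not in reachable_plus(rel, x) for x in elements)
```

For a relation on a finite set, acyclicity and termination coincide, so `is_acyclic` also serves as a termination test in that restricted setting.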

2.      Confluence  Properties 

A relation $\to$ is locally confluent iff $\leftarrow \cdot \to$ is a subset of $\to^* \cdot \leftarrow^*$, or in other
words, for all $x, y, z$: if $x \to y$ & $x \to z$, then there is a $u$ such that $y \to^* u$ & $z \to^* u$.

We say $\to$ is globally confluent iff for all $x, y$: $x \leftarrow^* \cdot \to^* y \implies x \to^* \cdot \leftarrow^* y$. In
Figure 3.1, these properties are shown. In this figure, dashed arrows denote
reductions whose existence follows from the reductions shown by full arrows.

From now on, we will use confluent to mean globally confluent.

The relation $\to$ may be interpreted as $\beta$-reduction in the $\lambda$-calculus, or as the
operational semantics of a programming language (see Huet [1980]).
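The two definitions can be tested by brute force on a finite relation. The sketch below assumes a relation given as a set of pairs; the names `star`, `joinable`, `locally_confluent`, and `confluent` are hypothetical. The relation `loop` is a standard four-element example in which reductions need not terminate.

```python
def star(rel, x):
    """All y with x ->* y."""
    seen, todo = {x}, [x]
    while todo:
        a = todo.pop()
        for (p, q) in rel:
            if p == a and q not in seen:
                seen.add(q)
                todo.append(q)
    return seen

def joinable(rel, y, z):
    """True iff y ->* u and z ->* u for some u."""
    return bool(star(rel, y) & star(rel, z))

def locally_confluent(rel, elements):
    """Every pair of one-step successors of a common x is joinable."""
    return all(
        joinable(rel, y, z)
        for x in elements
        for (p, y) in rel if p == x
        for (q, z) in rel if q == x
    )

def confluent(rel, elements):
    """Every pair of elements reachable from a common x is joinable."""
    return all(
        joinable(rel, y, z)
        for x in elements
        for y in star(rel, x)
        for z in star(rel, x)
    )

# a -> b, a -> c, b -> d, c -> d
diamond = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
# a <- b <-> c -> d
loop = {("b", "a"), ("b", "c"), ("c", "b"), ("c", "d")}
```

The relation `diamond` passes both tests. The relation `loop` is locally confluent but not confluent, because $a$ and $d$ are both reachable from $b$ and neither can be reduced further.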

B.    RELATION  TO  CHURCH-ROSSER  PROPERTY 

Theorem 3.1: A relation is confluent if and only if it has the Church-Rosser
property ($\chi$).

Proof: In the present notation, the property ($\chi$) states that $(\to \cup \leftarrow)^*$ is a
subset of $\to^* \cdot \leftarrow^*$.

One direction is trivial: since $\leftarrow^* \cdot \to^*$ is a subset of $(\to \cup \leftarrow)^*$, the property
($\chi$) implies confluence. For the other direction, assume that $\to$ is confluent; we have
to show that $(\to \cup \leftarrow)^*$ is a subset of $\to^* \cdot \leftarrow^*$. Since
$(\to \cup \leftarrow)^* = \bigcup_{k \geq 0} (\to \cup \leftarrow)^k$, it suffices to show that every
$(\to \cup \leftarrow)^k$ is a subset of $\to^* \cdot \leftarrow^*$. For $k = 0$ and $k = 1$ this holds trivially.

We proceed by induction. Assume the claim is true for all $k \leq n$, hence
$\bigcup_{k=1}^{n} (\to \cup \leftarrow)^k$ is a subset of $\to^* \cdot \leftarrow^*$. We must show it is true for $k = n+1$.

If $(x, y)$ is in $(\to \cup \leftarrow)^{n+1}$, then there exists a $z$ such that
$(x, z) \in (\to \cup \leftarrow)^n$ and $(z, y) \in \to$ or $\leftarrow$. Let us look at these two cases:

Case 1: $(z, y) \in \leftarrow$. Since $(x, z) \in (\to \cup \leftarrow)^n$, by assumption $(x, z) \in \to^* \cdot \leftarrow^*$.
As a result $(x, y) \in \to^* \cdot \leftarrow^* \cdot \leftarrow$, which is a subset of $\to^* \cdot \leftarrow^*$.

Case 2: $(z, y) \in \to$. Since $(x, z) \in \to^* \cdot \leftarrow^*$, we have $(x, y) \in \to^* \cdot \leftarrow^* \cdot \to$.
By the confluence property, $\leftarrow^* \cdot \to$ is a subset of $\to^* \cdot \leftarrow^*$. Therefore $(x, y)$ is in
$\to^* \cdot \to^* \cdot \leftarrow^*$, which is equal to $\to^* \cdot \leftarrow^*$.

This completes the proof.

Figure 3.1. (a) Local Confluence Property, (b) Global Confluence Property.

C.    LEMMAS  ON  RELATIONS  WHICH  ARE  CONFLUENT 

Lemma 3.1: If a relation is (globally) confluent, then the normal form of any
element, if it exists, is unique.

Proof: Assume that $x$ in $E$ has two normal forms $y$ and $z$. Then
$(y, z) \in \leftarrow^* \cdot \to^*$, and by the confluence property this implies $(y, z) \in \to^* \cdot \leftarrow^*$. So
there is a $u$ such that $y \to^* u$ & $z \to^* u$. But by the definition of normal form,
neither $y$ nor $z$ admits any reduction, so $y = u = z$. So $y$ and $z$ are the same object.
This completes the proof.

We define a relation $\to$ to be semi-confluent if and only if $\leftarrow \cdot \to^*$ is a subset of
$\to^* \cdot \leftarrow^*$.

Lemma 3.2: A relation is confluent if and only if it is semi-confluent.

Proof: Let $P(k)$ be the statement

$\leftarrow^k \cdot \to^* \subseteq \to^* \cdot \leftarrow^*$.

Then the lemma turns into:

$P(1)$ iff $P(k)$ for all $k \geq 0$.

Confluence trivially implies semi-confluence. For the converse, assume $P(1)$ is true.
$P(0)$ holds trivially, and we proceed by induction on $k$:

(1) $P(1)$ is true by assumption.

(2) Assume $P(n)$ is true.

(3) Show that $P(n+1)$ is true; then the proof is done.

Assume $(y, z)$ is an element of $\leftarrow^{n+1} \cdot \to^*$. Then there are an $x$ and a $y'$ such that
$x \to^n y' \to y$ and $x \to^* z$. So by (2) applied to $(y', z)$, there is a $v$ such that
$y' \to^* v$ & $z \to^* v$. By (1) applied to $(y, v)$, there is a $u$ such that $y \to^* u$ & $v \to^* u$.
Since $z \to^* v \to^* u$, $z \to^* u$. So the pair $(y, z)$ is an element of $\to^* \cdot \leftarrow^*$.
This completes the proof.

D.    LOCALIZATION  OF  CONFLUENCE 

Lemma 3.3: A terminating relation is confluent if and only if it is locally
confluent.

Proof: It is sufficient to prove that local confluence implies global confluence.
Assume $\to$ is locally confluent. For an element $x$, write $\leftarrow^* x \to^*$ for the set of
pairs $(y, z)$ such that $x \to^* y$ and $x \to^* z$, and define the set $A$ as follows:

$A = \{ x \mid \leftarrow^* x \to^* \text{ is not a subset of } \to^* \cdot \leftarrow^* \}$.

If $A = \emptyset$, the proof is complete. Assume $A$ is not empty. Let $x$ be a rightmost
element of $A$, i.e., if $x \to y$, then $y$ is not an element of $A$. Such elements exist,
for if $x_0 \in A$ and $x_0$ is not rightmost, there exists $x_1 \in A$ with $x_0 \to x_1$. If $x_1$ is not
rightmost, then there exists $x_2 \in A$ such that $x_0 \to x_1 \to x_2$, etc. Thus there exists a
sequence of elements in $A$

$x_0 \to x_1 \to x_2 \to \cdots \to x_i \to \cdots$

By the terminating condition, this sequence terminates in some element
$x_j \in A$, which must be rightmost in $A$.

Consider $\leftarrow^* x \to^*$. Let $(y, z)$ be an element of $\leftarrow^* x \to^*$ that is not in $\to^* \cdot \leftarrow^*$.
Neither $y$ nor $z$ can be $x$ itself, since the pair would then trivially be in
$\to^* \cdot \leftarrow^*$; and since $\to$ is locally confluent, $(y, z)$ is not in $\leftarrow x \to$.
So both reductions begin with single steps $x \to y'$ and $x \to z'$:

$(y, z) \in \leftarrow^* y' \leftarrow x \to z' \to^*$.

By local confluence, $(y', z') \in \to^* u \leftarrow^*$ for some $u$. Also, since $x \to y'$ and $x \to z'$
and $x$ is rightmost, $y'$ and $z'$ are not in $A$. Therefore we have

$(y, z) \in \leftarrow^* y' \to^* u \leftarrow^* z' \to^*$,

where $\leftarrow^* y' \to^*$ and $\leftarrow^* z' \to^*$ are subsets of $\to^* \cdot \leftarrow^*$ (since $y', z' \notin A$).

Assume $(y, u) \in \to^* v \leftarrow^*$ and $(u, z) \in \to^* w \leftarrow^*$. Thus $(v, w) \in \leftarrow^* u \to^*$. But since
$y'$ and $z'$ are not in $A$, and $y' \to^* u$, $z' \to^* u$, $u$ is not in $A$ either. Thus there exists
an $x'$ such that $(v, w) \in \to^* x' \leftarrow^*$. Then

$(y, z) \in \to^* v \to^* x' \leftarrow^* w \leftarrow^*$, which is a subset of
$\to^* \cdot \to^* \cdot \leftarrow^* \cdot \leftarrow^* = \to^* \cdot \leftarrow^*$.

But $(y, z) \notin \to^* \cdot \leftarrow^*$, and this is a contradiction. Thus $A$ must be empty, and this
completes the proof. Figure 3.2 is a diagrammatic representation of the proof.

Figure 3.2. Localization of Confluence Property

Corollary: A terminating relation satisfies the Church-Rosser property if
and only if it is locally confluent.

Proof: By Lemma 3.3, for a terminating relation local confluence is
equivalent to global confluence, and by Theorem 3.1 confluence and the
Church-Rosser property are equivalent. Since both results are
equivalences, for a terminating relation local confluence is equivalent to the
Church-Rosser property. This completes the proof.

IV.    USES  OF  THE  CONFLUENCE  PROPERTY

Determination of confluence is an integral step towards deciding
various properties of a formally defined system. The systems handled in
this chapter are term rewriting systems associated with abstract data types. In
the first section we briefly describe algebras to provide a background. We discuss
an initial algebra approach to the implementation and correctness of abstract data
types, as used in a high level language (see, for example, Goguen [1977]).

A.    AN  INITIAL  ALGEBRA  FOR  ABSTRACT  DATA  TYPES 

Abstract data types are a powerful tool in programming in two different
respects. First, it is convenient for the user to think in abstract terms, and second,
abstraction provides a means for discussing software independently of
implementation. Algebras have been found to be a promising method for the
specification of abstract data types (Guttag [1978]). We will study an
implementation of abstract data types (such as stack and queue) as initial algebras.
We assume that the reader is familiar with abstraction in terms of computer science.

1.     What  is  an  algebra 

An algebra is composed of two main parts. The first one includes two
subparts, which are the carriers and the collection of operators of the algebra.
The index set of the carriers, which may have one or more elements, is called the
sort set. If there is one element in this set, the algebra is called a one-sorted
algebra. As an example, the sort set {real, boolean, integer} consists of the sorts
real, boolean, and integer. But, as we realize, since an algebra can have an infinite
number of elements, this is not enough to specify an algebra. We will come to that
point in the specification of abstract data types.

If $S$ is the set of sorts of an algebra, then the signature $\Sigma$ is defined to be
the collection of sets $\Sigma_{w,s}$, where $w \in S^*$ and $s \in S$, that describe the sets of
operations of the form

$F : A_{s_1} \times \cdots \times A_{s_n} \to A_s$, where $w = s_1 s_2 \cdots s_n$,

that are in the algebra.

For example, the operation + on integers may be denoted as

$+ : intg, intg \to intg$

and is an element of $\Sigma_{intg\,intg,\,intg}$. So if $\Sigma$ is defined only on intg (integer), it is a
one-sorted signature. If an operation $F \in \Sigma_{w,s}$, the arity of $F$ is $|w|$, where $|w|$ is the
number of sorts in $w$. The sort of $F$ is $s$ if $F \in \Sigma_{w,s}$.

In a boolean algebra

$T : \to bool$ and $F : \to bool$

may be considered as constants (which are 0-ary functions).

In general the components of the signature $\Sigma$ for integer are as follows:

$\Sigma_{\lambda,\,intg} = \{0\}$,


There is an ambiguity here: the operator $-$ is used both as a unary and as a binary
operator, namely, as negation and subtraction respectively. For integers there are
no ternary, 4-ary, etc. operations, so their sets are empty.

If two algebras have possibly different carriers but the same signature $\Sigma$, then
they are called $\Sigma$-algebras. When $A$ and $B$ are $\Sigma$-algebras, $A$ is a subalgebra of
$B$ means that $A$ is a subset of $B$ (for their carriers) and that each operation
named by $F \in \Sigma_{s_1 \cdots s_n,\,s}$ in $A$ is exactly that in $B$, restricted to the carriers of $A$;
that is, for $a_i \in A_{s_i}$, $i = 1, \ldots, n$,

$F_A(a_1, \ldots, a_n) = F_B(a_1, \ldots, a_n)$.

Also, if $A$ and $B$ are both $\Sigma$-algebras, a $\Sigma$-homomorphism $h : A \to B$ is a family of
functions $\langle h_s : A_s \to B_s \rangle_{s \in S}$ that preserve the operations:

(h0) If $F \in \Sigma_{\lambda,s}$, then $h_s(F_A) = F_B$;

(h1) If $F \in \Sigma_{s_1 \cdots s_n,\,s}$ and $\langle a_1, \ldots, a_n \rangle \in A_{s_1} \times \cdots \times A_{s_n}$, then
$h_s[F_A(a_1, \ldots, a_n)] = F_B[h_{s_1}(a_1), \ldots, h_{s_n}(a_n)]$.

A category $C$ of $\Sigma$-algebras consists of a class of $\Sigma$-algebras together with all the
$\Sigma$-homomorphisms between the algebras.

A homomorphism $h : A \to A'$ is an isomorphism iff there exists
$g : A' \to A$ such that $gh = 1_A$ and $hg = 1_{A'}$, where $1_A$ is the identity function of $A$.
The homomorphism $g$ is called the inverse of $h$.

The  basic  concept  of  this  section  is  the  following: 

An algebra $A$ is initial in a category $C$ of $\Sigma$-algebras iff for every
algebra $B$ in $C$ there exists a unique homomorphism $h : A \to B$.

The following is a corollary of this concept:

If $A$ and $A'$ are both initial algebras in $C$, then $A$ and $A'$ are
isomorphic. If $A''$ in $C$ is isomorphic to an initial algebra $A$, then $A''$ is also
initial (Goguen, Thatcher, and Wagner [1978]).

So the initial algebra in a category $C$ of $\Sigma$-algebras characterizes the isomorphism
class of an object; and by the meaning of isomorphism, this means it
characterizes an object "abstractly", in terms of its structure. An abstract data
type is the isomorphism class of an initial algebra in a category $C$ of $\Sigma$-algebras.
Thus we can speak of an initial algebra $A$ in $C$ as being the abstract data type.
The categories $C$ of $\Sigma$-algebras we are interested in are those which are
finitely describable (since abstract data types are finitely describable), but not
all categories $C$ of $\Sigma$-algebras are finitely describable. We are interested in
categories $C$ having as objects all $\Sigma$-algebras satisfying some finite set $\mathcal{E}$ of
equations (these are the axioms of a specification, which we will describe in the next
section). The set $\mathcal{E}$ is the second part of an algebraic specification.

Abstract  data  types  can  be  specified  by  equations,  which  are  called 
axioms  of  the  given  abstract  data  type.  We  will  next  present  the  mathematics 
needed  to  do  this. 

There are two main theorems (in fact only one, but in two versions for
different categories). The first one states that there is an initial $\Sigma$-algebra in
the category $C_\Sigma$ of all $\Sigma$-algebras, and the second covers the category $C_{\Sigma,\mathcal{E}}$ of all
$\Sigma$-algebras satisfying a set $\mathcal{E}$ of equations. The first is a special
case of the second with $\mathcal{E} = \emptyset$.

Some examples of abstract data types which are initial in a category $C_\Sigma$
are given below.

Example (1): The set of natural numbers is one of the most common data types.
The $\Sigma$-algebra for it can be denoted as follows:

$S = \{nat\}$, $\Sigma_{\lambda,\,nat} = \{0\}$, $\Sigma_{nat,\,nat} = \{SUCC\}$, $\Sigma_{w,s} = \emptyset$ otherwise.

The  basic  idea  here  is  that  further  operations  on  natural  numbers  can  be 
expressed  in  terms  of  the  two  basic  ones,  SUCC  and  0. 

A  property  that  the  algebraic  approach  shares  with  all  abstract  or 
axiomatic  characterizations  is  independence  of  representation.  So  we  are  not 
committed  to  thinking  of  integers  as  strings  of  decimal,  or  binary,  or  Roman 
characters.  This  is  certainly  crucial  to  being  able  to  prove  correctness  of  data 
representations. 

Example (2): Another specification is that of the boolean data type. The $\Sigma$-algebra
for it can be denoted as follows:

$S = \{bool\}$, $\Sigma_{\lambda,\,bool} = \{T, F\}$, $\Sigma_{bool,\,bool} = \{not\}$, $\Sigma_{bool\,bool,\,bool} = \{And\}$, and
$\Sigma_{w,s} = \emptyset$ otherwise.

The carriers of initial $\Sigma$-algebras, in categories of algebras satisfying certain
identities, will consist of equivalence classes of $\Sigma$-terms, and the familiar methods
of algebra (substitution of equals for equals, reduction, replacement, etc.) are
crucial for our proofs of correctness of data type specifications and for our ideas
about automatic implementation of data types from their specifications.

For the type nat, an initial algebra $T_\Sigma$ is isomorphic to the set
$\omega = \{0, 1, \ldots\}$ of nonnegative integers by the correspondence of $n$ with $SUCC^n(0)$,
where $SUCC^n(0)$ denotes $n$ applications of SUCC to 0.
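The initiality of the term algebra for nat can be illustrated with a short sketch. It assumes that terms are encoded as nested tuples and that a target algebra is given by an interpretation of 0 and of SUCC; the names `numeral` and `hom` are hypothetical. Each choice of target algebra determines exactly one structure-preserving map out of the terms.

```python
ZERO = ("0",)

def succ(t):
    """The term SUCC(t)."""
    return ("SUCC", t)

def numeral(n):
    """Build the term SUCC^n(0)."""
    t = ZERO
    for _ in range(n):
        t = succ(t)
    return t

def hom(zero_B, succ_B):
    """The homomorphism from nat terms into the algebra (zero_B, succ_B).

    It is forced by the conditions (h0) h(0) = zero_B and
    (h1) h(SUCC(t)) = succ_B(h(t)).
    """
    def h(t):
        if t == ZERO:
            return zero_B
        return succ_B(h(t[1]))
    return h

# Two different target algebras with the same signature.
to_int = hom(0, lambda n: n + 1)        # nonnegative integers
tally = hom("", lambda s: s + "|")      # strings of tally marks
```

Both `to_int` and `tally` interpret the same abstract terms, which reflects the independence of representation that the algebraic approach aims for.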

We want to constrain initial algebras to satisfy certain laws or equations.
For example, we want the binary operation TIMES to be associative, i.e., to
satisfy

TIMES(X, TIMES(Y, Z)) = TIMES(TIMES(X, Y), Z)

To make clear the ideas of equation and satisfaction requires a somewhat
elaborate preparation. The status of the variables (X, Y, Z above) has to be
clarified. The basic idea is that for each sort in $S$, there should be an infinite
supply of special symbols disjoint from any signature. To get variables into the
terms of an algebra, we can consider them as constants, that is, as nullary functions.
We see here that every variable in an algebra belongs to a specific sort. Since
each operator has a single sort, we can replace a variable with a
corresponding operator in a given term (but this will not give us any advantage
in our work); in fact, any operator belonging to the same class $\Sigma_{w,s}$ in $\Sigma$ can be
exchanged for another. This is called a substitution. As we will see in the
following definition, an equation describes the substitution of one
side by the other. If we consider the right hand side of an equation to be simpler
than the left, then the number of operands in the replacing operator must be less
than in the replaced one. If the replacing operator is nullary, this is
the most desirable case.

Definition: A $\Sigma$-equation is a pair $e = \langle L, R \rangle$ where $L$, $R$ are terms of an
algebra $A$. $A$ must satisfy the equations. The necessary condition for this is that the
number of variables on the left must equal the number on the right. If $A$ satisfies
every $e$ in $\mathcal{E}$, then such a set of equations is called a $\Sigma$-representation (axiom set
of $A$), the algebra $A$ is called a $(\Sigma,\mathcal{E})$-algebra, and the category of
$(\Sigma,\mathcal{E})$-algebras is denoted by $C_{\Sigma,\mathcal{E}}$.

Let $T_{\Sigma,\mathcal{E}}$ be an initial algebra in the category $C_{\Sigma,\mathcal{E}}$. We shall say that
$T_{\Sigma,\mathcal{E}}$ is presented by $\mathcal{E}$. The construction of such a $T_{\Sigma,\mathcal{E}}$ needs some machinery.

Definition: A $\Sigma$-congruence $\equiv$ on a $\Sigma$-algebra $A$ is a family $\langle \equiv_s \rangle_{s \in S}$ of
equivalence relations $\equiv_s$ on $A_s$ (where $A_s$ is the carrier of sort $s$), for $s \in S$,
such that if $F \in \Sigma_{s_1 \cdots s_n,\,s}$, and if $a_i, a_i' \in A_{s_i}$ and $a_i \equiv_{s_i} a_i'$ for
$i = 1, \ldots, n$, then

$F_A(a_1, \ldots, a_n) \equiv_s F_A(a_1', \ldots, a_n')$.

If $A$ is a $\Sigma$-algebra and $\equiv$ is a $\Sigma$-congruence on $A$, let $(A/\equiv)_s = A_s/\equiv_s$ be the set
of $\equiv_s$-equivalence classes of $A_s$. For $a \in A_s$, let $[a]_s$ denote the $\equiv_s$-class
containing $a$. Note that each element of $A_s/\equiv_s$ is of the form $[a]_s$, but of course
the choice of $a \in A_s$ is not uniquely determined. The idea here is to define an
algebra by partitioning it into congruence classes.

The definition of the operation $F_{A/\equiv}$ is as follows:

(q0) If $F \in \Sigma_{\lambda,s}$, then $F_{A/\equiv} = [F_A]$;

(q1) If $F \in \Sigma_{s_1 \cdots s_n,\,s}$ and $[a_i] \in (A/\equiv)_{s_i}$, then

$F_{A/\equiv}([a_1], \ldots, [a_n]) = [F_A(a_1, \ldots, a_n)]$.

This is also the definition of the homomorphism classes by the operation
$F$. If $A$ is a $\Sigma$-algebra and $\equiv$ is a $\Sigma$-congruence on $A$, then $A/\equiv$ is a $\Sigma$-algebra
called the quotient of $A$ by $\equiv$.
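A small instance of the quotient construction may help. The sketch below assumes a one-sorted algebra of Python integers with a constant 0 and a binary addition, and takes congruence modulo 3 as the congruence relation; the modulus and the names `congruent`, `cls`, `add_quotient`, and `respects_congruence` are hypothetical choices made for illustration.

```python
MODULUS = 3  # assumed for the example

def congruent(a, b):
    """a and b are related iff they differ by a multiple of MODULUS."""
    return (a - b) % MODULUS == 0

def cls(a):
    """Canonical representative of the class [a]."""
    return a % MODULUS

ZERO_QUOTIENT = cls(0)   # (q0): the constant of the quotient is [0]

def add_quotient(ca, cb):
    """(q1): addition on classes, computed from any representatives."""
    return cls(ca + cb)

def respects_congruence(samples):
    """Check that congruent arguments give congruent sums."""
    return all(
        congruent(a + b, a2 + b2)
        for a in samples for a2 in samples if congruent(a, a2)
        for b in samples for b2 in samples if congruent(b, b2)
    )
```

The check in `respects_congruence` is exactly the condition in the definition of a congruence, and it is what makes (q1) independent of the choice of representatives.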

Let $A$ be a $\Sigma$-algebra, and let $R$ be a relation on $A$. Then there is a least
$\Sigma$-congruence relation on $A$ containing $R$; it is called the congruence relation
generated by $R$ on $A$.

The main theorem is as follows:

Theorem: Let $\mathcal{E}$ be a $\Sigma$-representation and let $\equiv_\mathcal{E}$ be the $\Sigma$-congruence
on $T_\Sigma$ generated by $\mathcal{E}(T_\Sigma)$. Then $T_\Sigma/\equiv_\mathcal{E}$, the quotient of $T_\Sigma$ by $\equiv_\mathcal{E}$,
hereafter denoted $T_{\Sigma,\mathcal{E}}$, is the initial algebra in the category $C_{\Sigma,\mathcal{E}}$ of all
$\Sigma$-algebras satisfying $\mathcal{E}$.

B.    SPECIFICATION  OF  ABSTRACT  DATA  TYPES 

In this section we will give specifications in which the set $\mathcal{E}$ of equations
(axiom set) is nonempty. So these specifications rely upon the main theorem above.

1.      Specifications 

As  we  stated  earlier,  initial  algebras  may  consist  of  an  infinite  number  of 
objects.  We  want  to  find  convenient  ways  to  specify  them  in  finite  terms  so  we 
can  use  them  as  abstract  data  types.  By  starting  with  this  purpose,  a  formal 
definition  of  a  specification  is  as  follows: 

Definition:  A  specification  is  a  pair  <S,^>  where  S  is  a  composite  of  sort  set  S, 
and  an  S-sorted  signature  (note  that  E  is  extended  from  the  previous  meaning) 
and  ,f  is  a  set  of  S-equations. 

The  basic  idea,  here,  is  that  <E,^>  specifies  an  abstract  data  type  by  defining  the 
algebra  T^.,^- 

Sometimes  we  might  want  to  add  further  equations  on  an  existing  type, 
so  by  adding  the  equation  set  ^    on  the  algebra  Ty,<^  we  reach  T^  .  .     so  that  the 

new  type  is  a  quotient  of  the  old. 
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Let  us  define  a  specification  syntax  similar  to  one  defined  in  Davis  [1984]. 

SPECIFICATION <Abs-data-type>
OPERANDS
    <sort set S>
OPERATIONS
    op1: S* → S
    ...
    opn: S* → S
AXIOMS
    <equation set ℰ>

With  this  syntax,  a  specification  of  the  data  type  integer  is  as  follows: 

SPECIFICATION integer
OPERANDS
    int
OPERATIONS
    0    : → int,
    SUCC : int → int,
    PRED : int → int
AXIOMS
    PRED(SUCC(X)) = X,
    SUCC(PRED(X)) = X

where  SUCC  and  PRED  are  inverses  of  each  other,  and  X  is  a  free  variable  of 
sort  int. 

We    can    enrich    this    specification    by    adding    new    operations    to    the 
specification  without  disturbing  the  above  specification: 

SPECIFICATION integer
OPERANDS
    int
OPERATIONS
    0    : → int,
    SUCC : int → int,
    PRED : int → int,
    ADD  : int, int → int,
    SUB  : int, int → int,
    MULT : int, int → int,
    NEG  : int → int
AXIOMS
    PRED(SUCC(X)) = X,
    SUCC(PRED(X)) = X,
    ADD(X, 0) = X,
    ADD(X, Y) = ADD(Y, X),
    ADD(X, ADD(Y, Z)) = ADD(ADD(X, Y), Z),
    ADD(X, SUCC(Y)) = SUCC(ADD(X, Y)),
    ADD(X, PRED(Y)) = PRED(ADD(X, Y)),
    SUB(X, 0) = X,
    SUB(X, SUCC(Y)) = PRED(SUB(X, Y)),
    SUB(X, PRED(Y)) = SUCC(SUB(X, Y)),
    NEG(0) = 0,
    MULT(X, 0) = 0,
    MULT(X, SUCC(Y)) = ADD(MULT(X, Y), X),
    MULT(X, PRED(Y)) = SUB(MULT(X, Y), X)

Thus SUB, ADD, and MULT are operations derived from SUCC and PRED.
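
As an informal sanity check, the axioms for SUCC, PRED, ADD, SUB and MULT can be tested against the intended model, the ordinary integers. The following Python sketch is only an illustration: reading each operation as Python integer arithmetic and the helper name failed_axioms are our assumptions, and testing finitely many values does not prove that the axioms hold.

```python
from itertools import product

# Assumed interpretation of the operations in the intended model (the integers).
SUCC = lambda x: x + 1
PRED = lambda x: x - 1
ADD = lambda x, y: x + y
SUB = lambda x, y: x - y
MULT = lambda x, y: x * y

# Each axiom is paired with a predicate over sample values x, y, z.
AXIOMS = [
    ("PRED(SUCC(X)) = X", lambda x, y, z: PRED(SUCC(x)) == x),
    ("SUCC(PRED(X)) = X", lambda x, y, z: SUCC(PRED(x)) == x),
    ("ADD(X, 0) = X", lambda x, y, z: ADD(x, 0) == x),
    ("ADD(X, Y) = ADD(Y, X)", lambda x, y, z: ADD(x, y) == ADD(y, x)),
    ("ADD(X, ADD(Y, Z)) = ADD(ADD(X, Y), Z)",
     lambda x, y, z: ADD(x, ADD(y, z)) == ADD(ADD(x, y), z)),
    ("ADD(X, SUCC(Y)) = SUCC(ADD(X, Y))",
     lambda x, y, z: ADD(x, SUCC(y)) == SUCC(ADD(x, y))),
    ("ADD(X, PRED(Y)) = PRED(ADD(X, Y))",
     lambda x, y, z: ADD(x, PRED(y)) == PRED(ADD(x, y))),
    ("SUB(X, 0) = X", lambda x, y, z: SUB(x, 0) == x),
    ("SUB(X, SUCC(Y)) = PRED(SUB(X, Y))",
     lambda x, y, z: SUB(x, SUCC(y)) == PRED(SUB(x, y))),
    ("SUB(X, PRED(Y)) = SUCC(SUB(X, Y))",
     lambda x, y, z: SUB(x, PRED(y)) == SUCC(SUB(x, y))),
    ("MULT(X, 0) = 0", lambda x, y, z: MULT(x, 0) == 0),
    ("MULT(X, SUCC(Y)) = ADD(MULT(X, Y), X)",
     lambda x, y, z: MULT(x, SUCC(y)) == ADD(MULT(x, y), x)),
    ("MULT(X, PRED(Y)) = SUB(MULT(X, Y), X)",
     lambda x, y, z: MULT(x, PRED(y)) == SUB(MULT(x, y), x)),
]


def failed_axioms(values):
    """Return the names of axioms that fail for some x, y, z drawn from values."""
    samples = list(product(values, repeat=3))
    return [name for name, holds in AXIOMS
            if not all(holds(x, y, z) for x, y, z in samples)]


print(failed_axioms(range(-3, 4)))
```

An empty list means that no sampled triple falsifies an axiom in this model; it says nothing about confluence or termination of the axioms read as rewrite rules.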

2.     Extension 

Following the above specification of integer, we may add new operations which involve other sorts. For example, we may want to add predicates, conditionals or relations to an existing type which does not have the boolean type. To do so we extend the specification syntax above, as in the following example (for the sake of simplicity, we skip the parts already written).

SPECIFICATION integer
EXTEND
    boolean
WITH
OPERANDS
    int
OPERATIONS
    LTE : int, int → bool
AXIOMS
    LTE(X, X) = TRUE,
    LTE(X, Y) = LTE(SUCC(X), Y),
    LTE(X, Y) = LTE(SUB(X, Y), 0)

Here X and Y are free variables of sort int.

3.     Decidability of Equivalence

For a given specification (Σ, ℰ), when is the equivalence of two terms decidable?

We first give some definitions. A term in a specification is (a) a variable symbol, or (b) a function symbol applied to a finite number of terms, each built recursively in the same way. The length of a term is the number of function and variable symbols in it. For instance,

Length(f(x, g(x, y), h(z))) = 7
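
The Length measure is easy to compute mechanically. In the sketch below, the representation is our assumption: a variable is a Python string and a function application is a tuple whose first entry is the function symbol.

```python
def length(term):
    """Count the function and variable symbols in a term."""
    if isinstance(term, str):          # a variable symbol
        return 1
    # one for the function symbol, plus the lengths of its arguments
    return 1 + sum(length(arg) for arg in term[1:])


# The term f(x, g(x, y), h(z)) from the text.
example = ("f", "x", ("g", "x", "y"), ("h", "z"))
print(length(example))
```

The seven symbols counted are f, x, g, x, y, h and z; repeated variables are counted once per occurrence.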

If we can prove that the axioms of a specification, read as rewrite rules, are globally confluent and finitely terminating, then it follows that equivalence of any two terms is decidable. For a finitely terminating system, global confluence follows from local confluence, so we have to show that the axioms are locally confluent and finitely terminating.

As  is  shown  by  the  following  two  theorems  from  Huet  and  Lankford 
[1978],  finite  termination  is  in  general  undecidable. 

Theorem  1:  The  finite  termination  problem  of  term  rewriting  systems  is 
undecidable  even  if  terms  are  restricted  to  unary  and  nullary  functions. 

Theorem  2:  There  is  no  decision  procedure  for  finite  termination  of  term 
rewriting  systems. 

Here,  the  theorems  are  based  on  term  rewriting  systems.  But  as  we  will  show  in 
the  next  section,  every  specification  is  also  a  term  rewriting  system. 

Because of these results, we need some sufficient conditions to guarantee finite termination. We define an axiom x → y as nonexpanding if for every substitution θ applied to both x and y, Length(θ(x)) ≥ Length(θ(y)). The following theorem gives a sufficient condition for the finite termination of the rewrite rules.

Theorem 3.1: If a rewriting relation → is nonexpanding and admits no cycle t → ... → t, then it is finitely terminating.

Proof: Since the relation is nonexpanding, t → u implies Length(u) ≤ Length(t). Also the only variables that can occur in u are those in t. Thus there are only finitely many possibilities for u. An infinite reduction sequence from t would therefore have to repeat a term, which the absence of cycles rules out, so → must be finitely terminating (Guttag, Kapur, Musser [1983]).

As an example, let us work on the specification of Boolean.

SPECIFICATION Boolean
OPERANDS
    Bool
OPERATIONS
    TRUE  : → Bool
    FALSE : → Bool
    NOT   : Bool → Bool
    AND   : Bool, Bool → Bool
AXIOMS
    NOT(TRUE) → FALSE
    NOT(NOT(x)) → x
    AND(TRUE, x) → x
    AND(FALSE, x) → FALSE

We have to show that every sequence of reductions terminates. We use induction on the length k of the term being reduced. If k = 1, the claim holds, since the term must be a nullary operator or a variable, which is treated as a constant of the specification. Assume the claim holds for all terms of length k ≤ n, and consider a term t of length n + 1. Then t has one of the forms

NOT(x) or AND(y)

where the lengths of x and y are ≤ n. So the reductions will have one of the following forms:

NOT(x) → x'

AND(y) → y'

By the axioms, the lengths of x' and y' are at most n, so the inductive hypothesis applies to them. This completes the inductive proof.
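
The argument can be watched at work on a concrete term. The sketch below hard-codes the four Boolean axioms as a one-step rewrite function and records the length of the term after every step. The term representation (constants and variables as strings, applications as tuples) and the names step, length and trace are our assumptions, not part of the specification.

```python
def length(t):
    """Number of function, constant and variable symbols in t."""
    return 1 if isinstance(t, str) else 1 + sum(length(a) for a in t[1:])


def step(t):
    """Apply one Boolean axiom somewhere in t; return None if t is in normal form."""
    if not isinstance(t, tuple):
        return None
    if t[0] == "NOT":
        if t[1] == "TRUE":                                   # NOT(TRUE) -> FALSE
            return "FALSE"
        if isinstance(t[1], tuple) and t[1][0] == "NOT":     # NOT(NOT(x)) -> x
            return t[1][1]
    if t[0] == "AND":
        if t[1] == "TRUE":                                   # AND(TRUE, x) -> x
            return t[2]
        if t[1] == "FALSE":                                  # AND(FALSE, x) -> FALSE
            return "FALSE"
    # No axiom applies at the root: try the arguments from left to right.
    for i in range(1, len(t)):
        reduced = step(t[i])
        if reduced is not None:
            return t[:i] + (reduced,) + t[i + 1:]
    return None


def trace(t):
    """Return the list of terms visited while reducing t to normal form."""
    seen = [t]
    while (t := step(t)) is not None:
        seen.append(t)
    return seen


# AND(TRUE, NOT(NOT(AND(FALSE, x))))
start = ("AND", "TRUE", ("NOT", ("NOT", ("AND", "FALSE", "x"))))
for term in trace(start):
    print(length(term), term)
```

Every axiom strictly shortens the term it is applied to, so the recorded lengths decrease at each step and the reduction must stop.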

However, requiring a rewriting relation to be nonexpanding is somewhat restrictive. If we add another operation OR to the above specification, then the rule

AND(x, OR(y, z)) → OR(AND(x, y), AND(x, z))

which is in fact very useful, is expanding.
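
To see concretely why this distributive rule is expanding, we can count symbols on both sides with the Length measure defined earlier, each variable counting once per occurrence:

```latex
\begin{align*}
\mathrm{Length}\bigl(\mathrm{AND}(x, \mathrm{OR}(y, z))\bigr) &= 5, \\
\mathrm{Length}\bigl(\mathrm{OR}(\mathrm{AND}(x, y), \mathrm{AND}(x, z))\bigr) &= 7 .
\end{align*}
```

The right-hand side is longer because x is duplicated. Under any substitution θ the difference is 1 + Length(θ(x)), which is at least 2, so no instance of the rule satisfies the nonexpanding condition.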
The next step in the decision procedure is to show local confluence of the specification. This is the topic of the next chapter, which is based on Knuth and Bendix [1970].

4.     Correctness  of  Specifications 

Some abstract data types, e.g., natural numbers, integers, and strings, have previously existing mathematical models. Others, such as stack and symbol table, were introduced by computer science; for these, more or less acceptable mathematical models may be found. Still others are defined only by a user for a particular program, so responsibility for the model rests with that user.

If there is a mathematical model, it is necessary to give a rigorous proof that the specification is correct with respect to it. In this section we explain a very heavily used method for proving the correctness of a specification. The idea is that a specification (Σ, ℰ) is correct if T_{Σ,ℰ} is isomorphic to the mathematical model.

Since we have to show an isomorphism, correctness proofs can also be viewed the other way around; that is, we assume (Σ, ℰ) is correct and show that a model is correct by isomorphism to T_{Σ,ℰ}. To explain the idea, let us give an example on naturals.

Earlier we said that the signature Σ = <{0}, {SUCC}, ∅, ..., ∅, ...>, where Σ_i = ∅ for i ≥ 2, specifies the natural numbers. Let ω = {0, 1, 2, ...}, and let A be a Σ-algebra with carrier ω, 0_A = 0, and SUCC_A : n → n+1.

Since A is a Σ-algebra, there is a homomorphism h: T_Σ → A, where T_Σ is an initial algebra in the class C of Σ-algebras. What we have to show is that h is an isomorphism, so that A is an initial algebra. Since h is a Σ-homomorphism, h(0) = 0_A = 0. If h(SUCC^n(0)) = n, then h(SUCC^{n+1}(0)) = SUCC_A(h(SUCC^n(0))) = SUCC_A(n) = n+1, by definition of homomorphism. So we know h is surjective. It is also injective, since n = p implies SUCC^n(0) = SUCC^p(0). Thus h is an isomorphism.
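
The homomorphism h and its inverse can be written out directly. In this sketch the encoding of terms is our assumption: the constant 0 is the tuple ("0",) and SUCC(t) is the tuple ("SUCC", t). The names h and h_inverse are ours as well.

```python
ZERO = ("0",)


def h(term):
    """Map the term SUCC^n(0) of T_Sigma to the natural number n."""
    n = 0
    while term != ZERO:
        symbol, term = term        # peel off one SUCC
        assert symbol == "SUCC"
        n += 1
    return n


def h_inverse(n):
    """Build the term SUCC^n(0) from the natural number n."""
    term = ZERO
    for _ in range(n):
        term = ("SUCC", term)
    return term


three = ("SUCC", ("SUCC", ("SUCC", ZERO)))
print(h(three), h_inverse(2))
```

That the two functions undo each other on every sample is the computational counterpart of h being both surjective and injective.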

The proof of correctness of naturals involved no equations, that is, the equation set ℰ was empty. To prove the correctness of a specification (Σ, ℰ) where ℰ is not empty, we need some further development of our methods. The idea is to get a (Σ, ℰ)-algebra A whose carriers consist of canonical terms and then to show that A is isomorphic to the mathematical model M.

Definition: We say that a Σ-algebra A is a canonical Σ-term algebra if A_s is a subset of T_{Σ,s} for each s ∈ S, and if F(t_1, ..., t_n) is in A_s, then t_i is in A_{s_i} and F_A(t_1, ..., t_n) = F(t_1, ..., t_n).

With this definition, we have to show that for the specification (Σ, ℰ) there exists an initial (Σ, ℰ)-algebra A which is a canonical term algebra. The major step in doing this is to show that if A is a canonical Σ-term algebra then A is isomorphic to the initial algebra T_{Σ,ℰ}. For further information the reader may refer to Goguen [1977].

Before  ending  this  section,  we  have  to  say  that  the  correctness  of 
specifications  involves  the  realizability  problem  of  specifications,  which  is  to 
decide  if  an  initial  algebra  for  a  specification  is  computable.  Equivalently,  the 
problem  is  to  determine  if  equality  of  terms  in  the  algebra  is  decidable. 

C.    TERM  REWRITING  SYSTEMS 

Term rewriting systems are a very powerful and interesting model of computation. They have been widely used in formula manipulation and theorem proving systems, for tasks such as program optimization and program manipulation, and they may also be used to represent abstract interpreters for programming languages.

A generalization of these systems consists in considering rewritings on equivalence classes of terms, defined by a set of equations. In this sense, they may be used to define abstract data types. We define a term rewriting system R over a set of terms T as a finite set of rewrite rules, each of the form l(x) → r(x), where l and r are terms in T containing variables x. The set of rules is thus a set of directed equations, read from left to right.

To transform a set of equations ℰ of an algebra into a set of term rewriting rules, we may follow the algorithm explained in Huet and Oppen [1983].

Let V(N) be the variable set for a term N over Σ, and let M=N be an equation in ℰ. Then

1) If V(M) is a subset of V(N), put N → M in T,

2) If V(N) is a subset of V(M), put M → N in T,

3) Otherwise, if {x_1, ..., x_n} is the intersection of V(M) and V(N), introduce in Σ a new operator H of the appropriate type, and put in T the two rules M → P, N → P with P = H(x_1, ..., x_n).

Notice that V(M) may be equal to V(N). If so, we apply the first step. The third step may require the extended signature Σ' to have sorts not in S; it may then be necessary to add extra constants to Σ for it to be sensible. Also, certain Σ-algebras which were models may not be extendible as Σ'-algebras, since the corresponding carriers are empty.
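
The three orientation steps translate almost literally into code. The sketch below follows the steps in the order stated and ignores sorts entirely. The term representation (variables as strings, applications and constants as tuples), the function names, and the default name "H" for the new operator are our assumptions.

```python
def variables(term):
    """Return V(term), the set of variables occurring in term."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= variables(arg)
    return found


def orient(m, n, fresh="H"):
    """Turn the equation m = n into a list of rewrite rules (lhs, rhs)."""
    vm, vn = variables(m), variables(n)
    if vm <= vn:                       # step 1: V(M) is a subset of V(N)
        return [(n, m)]
    if vn <= vm:                       # step 2: V(N) is a subset of V(M)
        return [(m, n)]
    # step 3: introduce P = H(x_1, ..., x_n) over the shared variables
    p = (fresh,) + tuple(sorted(vm & vn))
    return [(m, p), (n, p)]


zero = ("0",)
print(orient(("MULT", "X", zero), zero))
print(orient(("F", "x", "y"), ("G", "x", "z")))
```

The first call yields the single rule MULT(X, 0) → 0. The second falls under step 3 and yields F(x, y) → H(x) and G(x, z) → H(x). In every case the variables of a right-hand side are contained in those of its left-hand side.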

The  fundamental  difference  between  equations  and  term  rewriting  rules  is 
that  equations  denote  equality  (which  is  symmetric)  whereas  term  rewriting 
systems  treat  equations  directionally  as  one-way  (left  to  right)  replacements. 
Before  going  further,  we  have  to  define  the  notion  of  a  critical  pair  as  defined  in 
Dershowitz  [1985]. 

Let l(x) → r(x) and l'(y) → r'(y) be two rules in T whose variables x and y have been renamed, if necessary, so they are distinct. We write l(x) = u(v)(x) to indicate that l(x) contains the (nonvariable) subterm v embedded in the context u. We say that l overlaps (or superposes) l' if l(x) = u(v)(x) and there is a (most general) substitution σ for the variables x and y such that v(σ) = l'(σ). In that case, the overlapped term l(σ) can be rewritten to either r(σ) or u(r')(σ). These two possibilities are called a critical pair. For example, the two rules F(G(x,y,A)) → H(x,y) and G(B,x,y) → K(y,x) determine a critical pair <F(K(A,x)), H(B,x)> shown in Figure 4.1.b.


Figure 4.1  Idea of a critical pair.
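
The critical pair of the example can be computed mechanically by unifying each nonvariable subterm of the first left-hand side with the second left-hand side. The following is a minimal sketch under stated assumptions: variables are Python strings, applications and constants are tuples, the second rule is renamed apart by appending a prime, and trivial overlaps of a rule with itself at the root are not filtered out. All function names are ours.

```python
def is_var(t):
    return isinstance(t, str)


def walk(t, s):
    """Follow variable bindings in the substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    return t == v if is_var(t) else any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s):
    """Return a most general unifier extending s, or None if none exists."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def substitute(t, s):
    if is_var(t):
        return substitute(s[t], s) if t in s else t
    return (t[0],) + tuple(substitute(a, s) for a in t[1:])


def rename(t, suffix="'"):
    if is_var(t):
        return t + suffix
    return (t[0],) + tuple(rename(a, suffix) for a in t[1:])


def positions(t):
    """Yield (path, subterm) for every nonvariable subterm of t."""
    if is_var(t):
        return
    yield (), t
    for i, arg in enumerate(t[1:], start=1):
        for path, sub in positions(arg):
            yield (i,) + path, sub


def replace_at(t, path, new):
    if not path:
        return new
    i = path[0]
    return t[:i] + (replace_at(t[i], path[1:], new),) + t[i + 1:]


def critical_pairs(rule1, rule2):
    """Superpose the left-hand side of rule2 onto subterms of rule1's left-hand side."""
    l1, r1 = rule1
    l2, r2 = rename(rule2[0]), rename(rule2[1])
    pairs = []
    for path, sub in positions(l1):
        s = unify(sub, l2, {})
        if s is not None:
            pairs.append((substitute(replace_at(l1, path, r2), s),
                          substitute(r1, s)))
    return pairs


A, B = ("A",), ("B",)
rule1 = (("F", ("G", "x", "y", A)), ("H", "x", "y"))   # F(G(x,y,A)) -> H(x,y)
rule2 = (("G", B, "x", "y"), ("K", "y", "x"))          # G(B,x,y)    -> K(y,x)
print(critical_pairs(rule1, rule2))
```

Up to the renaming of x to x', the single pair produced is <F(K(A,x)), H(B,x)>, the pair given in the text.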
Five desirable properties of a term rewriting system are as follows (as explained in Dershowitz [1985]):

1)  Termination  -  no  infinite  derivations  are  possible, 

2)  Confluence  -  each  term  has  at  most  one  normal  form, 

3)  Soundness  -  terms  are  only  rewritten  to  equal  terms, 

4)  Completeness  -  equal  terms  have  the  same  normal  forms, 

5)  Correctness  -  all  normal  forms  satisfy  given  requirements. 

A rewrite system T is canonical for an equational theory E if it is terminating, confluent, sound (with respect to E), and complete (with respect to E). It can then be used to decide whether an equation M=N follows from the axioms in E by checking whether or not the unique normal forms of M and N are the same.

Here we work on determining confluence of a term rewriting system T. The completeness of using rewrite rules to make equational deductions is expressed by the following Church-Rosser property of T.

T is Church-Rosser if and only if, for all M and N, M =_T N if and only if there exists a P such that M →* P and N →* P.

We say that P is in normal form (relative to T) if and only if there is no P' such that P → P', that is, no subterm of P is an instance of a left-hand side of a rule in T. We say P is a T-normal form of M if M →* P and P is a normal form relative to T. When T is Church-Rosser, the normal form of a term is unique, when it exists. A sufficient condition for the existence of such a unique normal form is the termination of all rewritings.

The confluence property is undecidable for an arbitrary term rewriting system, since a confluence test could be used to decide the equivalence, for instance, of recursive program schemas (Dershowitz [1985]). The decidability of confluence for ground term rewriting systems is open. We say that a term M is ground if and only if V(M) = ∅. For example, 0+SUCC(0) is a ground term of sort Nat.

We now turn to decidability of confluence for finitely terminating term rewriting systems. The general theorem 3.1 proved in chapter 3 was originally discovered by Newman for rewriting systems (see Newman [1942]).

The next step is to show that local confluence of a (finite) term rewriting system is decidable. The following theorem is used as the basis for an algorithm to decide confluence for finitely terminating systems (Knuth and Bendix [1970]):

Theorem 4.1: A terminating rewrite system is confluent if and only if both terms in each of its critical pairs reduce to the same term.

Combining the Knuth-Bendix theorem and Newman's theorem gives us a decision procedure for the confluence of finitely terminating term rewriting systems with a finite number of rules. When such a system T satisfies the critical pair condition, it defines a canonical form for the corresponding equational theory ≡_ℰ. We then say that T is a canonical term rewriting system.

The next chapter explains the algorithm discovered by Knuth and Bendix.

V.    AN ALGORITHM FOR TESTING FOR CONFLUENCE

So far, we have discussed the notion of term rewriting systems and their properties. We have said that equivalence of any two terms in a term rewriting system is decidable if the system is both terminating and locally confluent. To complete this, we have to find a way to test a system for confluence.

The Knuth-Bendix theorem gives a decision procedure for the confluence of terminating rewrite systems. The basic idea is to consider the case where two left-hand sides in a term rewriting system R superpose in a nontrivial way to create an ambiguity of the form M → N_1, M → N_2 (then N_1 and N_2 are a critical pair). The system R is nonconfluent if and only if some such pair N_1 and N_2 reduces to distinct R-normal forms P_1 and P_2 (Huet [1981]).

The Knuth-Bendix completion algorithm attempts to transform a nonconfluent system into a confluent one by adding new rewrite rules, such as P_1 → P_2. This must be done in such a way that the transformed system is still terminating. Certainly, one round of completion is not sufficient in general, since new ambiguities may have been created. During this completion process, some newly introduced rule may simplify some old rule, either on its left or on its right-hand side. It is essential, both for efficiency and elegance, to keep all rules interreduced as much as possible. But then the question arises as to how the process can be carried out efficiently in an incremental fashion, that is, we do not want to recompute critical pairs between rules that have been previously considered. However, the rules that have been used to resolve these ambiguities may not exist anymore, and so this step must be carefully justified. When a set of equations can be oriented so that the completion process terminates, the resulting term rewriting system defines a decision procedure for the equality problem in the corresponding system (Huet [1981]).

Before presenting the algorithm, we define a reduction ordering as a well-founded partial ordering on terms closed by term replacement and substitution. That is, M > N implies that P[M] > P[N] for any term context P[] and σ(M) > σ(N) for any substitution σ. We note that if > is a reduction ordering such that we have λ > ρ for every λ → ρ in R, then R is obviously terminating. Such a set of rewrite rules R is complete if and only if it is locally confluent (Knuth and Bendix [1970], Theorem 4).

The completion algorithm is as follows (from Huet [1981]):

Initial data: a (finite) set of equations ℰ, and a (recursive) reduction ordering >.

ℰ_0 := ℰ; R_0 := ∅; i := 0; p := 0;
loop
    while ℰ_i ≠ ∅ do
        Reduce equation: Select equation M=N in ℰ_i.
        Let M', N' be R_i-normal forms of M, N respectively, obtained
        by applying rules of R_i in any order, until none applies.
        if M' = N' then
            ℰ_{i+1} := ℰ_i − {M=N}; R_{i+1} := R_i; i := i+1
        else if (M' > N') or (N' > M') then
            begin
            if M' > N' then λ := M'; ρ := N'
            else λ := N'; ρ := M' endif;
            Add new rule: Let K be the set of labels k of rules of R_i
            whose left-hand side λ_k is reducible by λ → ρ, say to λ'_k.
            ℰ_{i+1} := ℰ_i − {M=N} ∪ {λ'_k = ρ_k | k: λ_k → ρ_k is in R_i with k in K};
            p := p+1;
            R_{i+1} := {j: λ_j → ρ'_j | j: λ_j → ρ_j is in R_i with j not in K} ∪ {p: λ → ρ},
            where ρ'_j is a normal form of ρ_j under R_i ∪ {λ → ρ}.
            The rules coming from R_i are marked or unmarked as they
            were in R_i; the new rule λ → ρ is unmarked;
            i := i+1
            end
        else exitloop (failure) endif
    endwhile;
    Compute critical pairs: If all rules in R_i are marked, exitloop
    (R_i is confluent and terminating; in other terms, it is complete).
    Otherwise select an unmarked rule in R_i, say with label k. Let ℰ_{i+1}
    be the set of all critical pairs computed between rule k and any
    rule of R_i of label not greater than k.
    Let R_{i+1} be the same as R_i, except that rule k is now marked.
    i := i+1
endloop.

When given a finite set of equations ℰ and a reduction ordering > on terms, the completion algorithm may stop with success, stop with failure or loop forever. When it stops with failure, either the algorithm should be tried again with a different ordering that will order the two terms M', N' which were incomparable; or some new function symbol should be added with a definition in ℰ that will reduce M' or N'; or else the method is not applicable (see Lemma 2 in Huet [1981]).
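
To make the control flow of completion tangible, the following sketch runs it on string rewriting systems, the special case in which every function symbol is unary. It is a simplification and not Huet's procedure as stated: it uses the shortlex ordering (a total reduction ordering on strings, so the failure exit never occurs), it omits the marking of rules and recomputes all critical pairs in every round, and it gives up after a bounded number of rounds instead of looping forever. The function names and the max_rounds parameter are our assumptions.

```python
def shortlex_greater(u, v):
    """Shortlex order on strings: longer is greater, ties broken alphabetically."""
    return (len(u), u) > (len(v), v)


def normalize(word, rules):
    """Rewrite word until no left-hand side of a rule occurs in it."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                word = word.replace(lhs, rhs, 1)
                changed = True
                break
    return word


def critical_pairs(rule1, rule2):
    """Pairs obtained by rewriting an overlap of the two left-hand sides both ways."""
    (l1, r1), (l2, r2) = rule1, rule2
    pairs = []
    # Proper overlap: a suffix of l1 equals a prefix of l2.
    for k in range(1, min(len(l1), len(l2))):
        if l1[-k:] == l2[:k]:
            pairs.append((r1 + l2[k:], l1[:-k] + r2))
    # Inclusion: l2 occurs inside l1 (skipped when both rules are the same).
    if rule1 != rule2:
        start = l1.find(l2)
        while start != -1:
            pairs.append((r1, l1[:start] + r2 + l1[start + len(l2):]))
            start = l1.find(l2, start + 1)
    return pairs


def complete(equations, max_rounds=100):
    """Try to turn equations (pairs of strings) into a complete rule set."""
    rules, pending = [], list(equations)
    for _ in range(max_rounds):
        while pending:
            u, v = pending.pop()
            u, v = normalize(u, rules), normalize(v, rules)
            if u == v:
                continue
            lhs, rhs = (u, v) if shortlex_greater(u, v) else (v, u)
            # Old rules whose left side the new rule reduces become equations again.
            pending += [rule for rule in rules if lhs in rule[0]]
            rules = [rule for rule in rules if lhs not in rule[0]] + [(lhs, rhs)]
            # Keep right-hand sides reduced.
            rules = [(l, normalize(r, rules)) for l, r in rules]
        pending = [
            (p, q)
            for rule1 in rules
            for rule2 in rules
            for p, q in critical_pairs(rule1, rule2)
            if normalize(p, rules) != normalize(q, rules)
        ]
        if not pending:
            return rules
    raise RuntimeError("no complete system found within max_rounds")


rules = complete([("ab", "c"), ("bb", "")])
print(sorted(rules))
```

For the two equations ab = c and bb = ε, the overlap abb rewrites both to cb and to a, so the sketch adds the rule cb → a, after which every critical pair is joinable.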

The  following  examples  of  the  algorithm  are  taken  from  Knuth  and  Bendix 
[1970],  and  were  programmed  for  computation  in  FORTRAN  IV  on  an  IBM 
7094. 

Example 1. Group theory I. The first example is the traditional definition of an abstract group. Here we have three operators: a binary operator ·, a unary operator ⁻, and a nullary operator e, satisfying the following three axioms.

1. e·a → a            (Left identity)

2. a⁻·a → e           (Inverse for all elements in group)

3. (a·b)·c → a·(b·c)  (Multiplication is associative)

The procedure was first carried out by hand, to see if it would succeed in deriving the identities a·e = a, a⁻⁻ = a, etc., without making use of any more ingenuity than can normally be expected of a computer's brain. (From now on we will write ab instead of a·b for simplicity.) The success of this hand-computation experiment provided the initial incentive to create a computer program, so that experiments on other axiom systems could be performed.

When the computer program was finally completed, the machine treated the above three axioms as follows: First axioms 1 and 2 were found to be complete by themselves; but when λ_1 = a⁻a of axiom 2 was superposed on μ = ab of λ_2 = (ab)c of axiom 3, the resulting formula (a⁻a)b could be reduced in two ways as

(a⁻a)b → a⁻(ab)

(a⁻a)b → eb → b.

Therefore a new axiom is added,

4. a⁻(ab) → b

Axiom 1 was superposed on the subterm ab of this new axiom, and another axiom resulted:

5. e⁻a → a.

The computation continued as follows:

6. a⁻⁻e → a           from 2 and 4,

7. a⁻⁻b → ab          from 6 and 3,

Now axiom 6 was no longer irreducible and it was replaced by

8. ae → a

Thus, the computer found a proof that e is a right identity; the proof is essentially the following, if reduced to applications of axioms 1, 2, and 3:

ae = (ea)e = ((a⁻⁻a⁻)a)e = (a⁻⁻(a⁻a))e = (a⁻⁻e)e = a⁻⁻(ee) = a⁻⁻e = a⁻⁻(a⁻a) = (a⁻⁻a⁻)a = ea = a

This ten-step proof is apparently the shortest possible one.
The completion continued further:

9. e⁻ → e             from 2 and 8,
(Now axiom 5 disappeared.)

10. a⁻⁻ → a           from 7 and 8,
(Now axiom 7 disappeared.)

11. aa⁻ → e           from 10 and 2,

12. a(b(ab)⁻) → e     from 3 and 11,

13. a(a⁻b) → b        from 11 and 3,

So far, the computation was done almost as a professional mathematician would have performed things. The axioms present at this point were 1, 2, 3, 4, 8, 9, 10, 11, 12, 13. These do not form a complete set, and the ensuing computation reflected the computer's groping for the right way to complete the set:

14. (ab)⁻(a(bc)) → c          from 3 and 4,

15. b(c((bc)⁻a)) → a          from 13 and 3,

16. b(c(a(b(ca))⁻)) → e       from 12 and 3,

17. a(ba)⁻ → b⁻               from 12 and 4, using 8,

18. b((ab)⁻c) → a⁻c           from 17 and 3,
(Now axiom 15 disappeared.)

19. b(c(a(bc))⁻) → a⁻         from 17 and 3,
(Now axiom 16 disappeared.)

20. (ab)⁻ → b⁻a⁻              from 17 and 4.

At this point, axioms 12, 14, 18, and 19 disappeared, and the resulting complete set of axioms included the axioms 1, 2, 3, 4, 8, 9, 10, 11, 13, and 20. A study of those ten reductions shows that they suffice to solve the word problem for free groups with no relations; two terms formed with the operators ·, ⁻, and e can be proved equivalent as a consequence of axioms 1, 2, and 3 if and only if they reduce to the same irreducible term, when the above ten reductions are applied in any order. The computation took 30 seconds.
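
The ten reductions of the complete set can be used directly as a decision procedure. The sketch below encodes them as rewrite rules and decides equality of two group terms by comparing normal forms. The term representation, the helper names, and the strategy of trying the root before the arguments are our assumptions. Rule 20 is entered as (ab)⁻ → b⁻a⁻, the usual law for the inverse of a product. The strings "a", "b", "c" inside the rules are pattern variables, while strings in the terms being reduced are treated as atoms.

```python
def mul(a, b):
    return ("*", a, b)


def inv(a):
    return ("inv", a)


E = ("e",)

RULES = [
    (mul(E, "a"), "a"),                                    # 1.  ea -> a
    (mul(inv("a"), "a"), E),                               # 2.  a⁻a -> e
    (mul(mul("a", "b"), "c"), mul("a", mul("b", "c"))),    # 3.  (ab)c -> a(bc)
    (mul(inv("a"), mul("a", "b")), "b"),                   # 4.  a⁻(ab) -> b
    (mul("a", E), "a"),                                    # 8.  ae -> a
    (inv(E), E),                                           # 9.  e⁻ -> e
    (inv(inv("a")), "a"),                                  # 10. a⁻⁻ -> a
    (mul("a", inv("a")), E),                               # 11. aa⁻ -> e
    (mul("a", mul(inv("a"), "b")), "b"),                   # 13. a(a⁻b) -> b
    (inv(mul("a", "b")), mul(inv("b"), inv("a"))),         # 20. (ab)⁻ -> b⁻a⁻
]


def match(pattern, term, s):
    """Extend s so that pattern instantiated by s equals term, or return None."""
    if isinstance(pattern, str):                 # pattern variable
        if pattern in s:
            return s if s[pattern] == term else None
        s[pattern] = term
        return s
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        if match(p, t, s) is None:
            return None
    return s


def substitute(t, s):
    if isinstance(t, str):
        return s[t]
    return (t[0],) + tuple(substitute(a, s) for a in t[1:])


def rewrite_once(term):
    """Apply one rule at the root or in a subterm; None if term is irreducible."""
    for lhs, rhs in RULES:
        s = match(lhs, term, {})
        if s is not None:
            return substitute(rhs, s)
    if isinstance(term, tuple):
        for i in range(1, len(term)):
            reduced = rewrite_once(term[i])
            if reduced is not None:
                return term[:i] + (reduced,) + term[i + 1:]
    return None


def normalize(term):
    while (reduced := rewrite_once(term)) is not None:
        term = reduced
    return term


def equal(t1, t2):
    """Decide t1 = t2 in the theory of groups by comparing normal forms."""
    return normalize(t1) == normalize(t2)


# (xy)⁻(x(yz)) reduces to z, which is former axiom 14.
print(normalize(mul(inv(mul("x", "y")), mul("x", mul("y", "z")))))
```

Because the system is complete, the order in which rules are tried affects only the number of steps taken, not the normal form reached.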

Example 2. Group theory II. Suppose we start as in Example 1 but with left identity and left inverse replaced by right identity and right inverse:

1. ae → a

2. aa⁻ → e

3. (ab)c → a(bc)

It should be emphasized that the computational procedure is not symmetrical between right and left, due to the nature of the well-ordering, so that this is quite a different problem from Example 1. In this case, axiom 1 combined with axiom 3 generates a(eb) → ab, which has no analog in the system of Example 1. The computer found this system slightly more difficult than the system of Example 1: 24 axioms were generated during the computation, of which 8 did not participate in the final set of reductions. It took 40 seconds.

Example 3. Inverse property. Suppose we have only two operators · and ⁻ as in the previous examples and suppose that only the single axiom

1. a⁻(ab) → b

is given. No associative law, etc., is assumed.

This example can be worked by hand: First we superpose a⁻(ab) onto its component (ab), obtaining the term a⁻⁻(a⁻(ab)), which can be reduced both to ab and to a⁻⁻b. This gives us a second axiom

2. a⁻⁻b → ab

as a consequence of axiom 1. Now a⁻(ab) can be superposed onto a⁻⁻b; we obtain the term a⁻⁻(a⁻b), which reduces to b by axiom 1, and to a(a⁻b) by axiom 2. Thus, a third axiom

3. a(a⁻b) → b

is generated. It is interesting (and not well known) that axiom 3 follows from axiom 1 and no other hypothesis. This fact can be used to simplify several proofs which appear in the literature, for example in the algebraic structures associated with projective geometry.

A rather tedious further consideration of about ten more cases shows that axioms 1, 2, 3 form a complete set. Thus, we can show that a⁻⁻b = ab is a consequence of axiom 1, but we cannot prove that a⁻⁻ = a without further assumptions.

Some  other  examples  given  by  Knuth  and  Bendix  explain  how  a  random 
axiom  set  can  cause  the  system  to  degenerate  by  creating  a  certain  illogical 
complete  set  (see  Knuth  and  Bendix  [1970]  for  detail).  There  are  also  some 
weaknesses  in  the  Knuth-Bendix  completion  procedure.  The  following  example  is 
given  to  exhibit  one  of  them  (example  18  of  Knuth  and  Bendix  [1970]). 

Example 4. Some unsuccessful experiments. The major restriction of the present system is that it cannot handle systems in which there is a commutative binary operator (for example for an abelian group), where

a·b = b·a

Since we have no way of deciding in general how to construe this as a "reduction", the method must be supplemented with additional techniques to cover this case. Presumably an approach could be worked out in which we use two reductions

α → β and β → α

whenever α = β but α is not comparable with β, and make sure that no infinite looping occurs when reducing terms to a new kind of "irreducible" form. At any rate it is clear that the methods in which this algorithm is involved ought to be extended to such cases, so that rings and other varieties may be studied.

VI.    CONCLUSION

In our survey, an important fact that we have briefly mentioned is the undecidability of termination for a set of rewrite rules. Since termination is the main step on the way to proving confluence of a given set of rewrite rules, we have to deal with sufficient conditions for establishing termination. There is some recent work on this problem. One approach finds a cycle if one exists. This is discussed in the paper by J. V. Guttag, D. Kapur, and D. R. Musser [1983]. This procedure is an initial step in this area but is not efficient enough. Such problems could be the subject of other thesis research.

Most  attempts  to  apply  confluence  have  been  limited  because  of  our  inability 
to  solve  other,  more  complex  problems,  such  as  termination.  Of  course,  this 
reflects  the  ongoing  need  for  broadening  our  understanding  of  these  problems. 